125,596 research outputs found

    Dynamic Range Majority Data Structures

    Full text link
    Given a set PP of coloured points on the real line, we study the problem of answering range α\alpha-majority (or "heavy hitter") queries on PP. More specifically, for a query range QQ, we want to return each colour that is assigned to more than an α\alpha-fraction of the points contained in QQ. We present a new data structure for answering range α\alpha-majority queries on a dynamic set of points, where α(0,1)\alpha \in (0,1). Our data structure uses O(n) space, supports queries in O((lgn)/α)O((\lg n) / \alpha) time, and updates in O((lgn)/α)O((\lg n) / \alpha) amortized time. If the coordinates of the points are integers, then the query time can be improved to O(lgn/(αlglgn)+(lg(1/α))/α))O(\lg n / (\alpha \lg \lg n) + (\lg(1/\alpha))/\alpha)). For constant values of α\alpha, this improved query time matches an existing lower bound, for any data structure with polylogarithmic update time. We also generalize our data structure to handle sets of points in d-dimensions, for d2d \ge 2, as well as dynamic arrays, in which each entry is a colour.Comment: 16 pages, Preliminary version appeared in ISAAC 201

    External Memory Three-Sided Range Reporting and Top-k Queries with Sublogarithmic Updates

    Get PDF
    An external memory data structure is presented for maintaining a dynamic set of N two-dimensional points under the insertion and deletion of points, and supporting unsorted 3-sided range reporting queries and top-k queries, where top-k queries report the k points with highest y-value within a given x-range. For any constant 0 < epsilon <= 1/2, a data structure is constructed that supports updates in amortized O(1/(epsilon * B^{1-epsilon}) * log_B(N)) IOs and queries in amortized O(1/epsilon * log_B(N+K/B)) IOs, where B is the external memory block size, and K is the size of the output to the query (for top-k queries K is the minimum of k and the number of points in the query interval). The data structure uses linear space. The update bound is a significant factor B^{1-epsilon} improvement over the previous best update bounds for these two query problems, while staying within the same query and space bounds

    GPU LSM: A Dynamic Dictionary Data Structure for the GPU

    Full text link
    We develop a dynamic dictionary data structure for the GPU, supporting fast insertions and deletions, based on the Log Structured Merge tree (LSM). Our implementation on an NVIDIA K40c GPU has an average update (insertion or deletion) rate of 225 M elements/s, 13.5x faster than merging items into a sorted array. The GPU LSM supports the retrieval operations of lookup, count, and range query operations with an average rate of 75 M, 32 M and 23 M queries/s respectively. The trade-off for the dynamic updates is that the sorted array is almost twice as fast on retrievals. We believe that our GPU LSM is the first dynamic general-purpose dictionary data structure for the GPU.Comment: 11 pages, accepted to appear on the Proceedings of IEEE International Parallel and Distributed Processing Symposium (IPDPS'18

    On Approximate Range Mode and Range Selection

    Get PDF
    For any epsilon in (0,1), a (1+epsilon)-approximate range mode query asks for the position of an element whose frequency in the query range is at most a factor (1+epsilon) smaller than the true mode. For this problem, we design a data structure occupying O(n/epsilon) bits of space to answer queries in O(lg(1/epsilon)) time. This is an encoding data structure which does not require access to the input sequence; the space cost of this structure is asymptotically optimal for constant epsilon as we also prove a matching lower bound. Furthermore, our solution improves the previous best result of Greve et al. (Cell Probe Lower Bounds and Approximations for Range Mode, ICALP\u2710) by saving the space cost by a factor of lg n while achieving the same query time. In dynamic settings, we design an O(n)-word data structure that answers queries in O(lg n /lg lg n) time and supports insertions and deletions in O(lg n) time, for any constant epsilon in (0,1); the bounds for non-constant epsilon = o(1) are also given in the paper. This is the first result on dynamic approximate range mode; it can also be used to obtain the first static data structure for approximate 3-sided range mode queries in two dimensions. Another problem we consider is approximate range selection. For any alpha in (0,1/2), an alpha-approximate range selection query asks for the position of an element whose rank in the query range is in [k - alpha s, k + alpha s], where k is a rank given by the query and s is the size of the query range. When alpha is a constant, we design an O(n)-bit encoding data structure that can answer queries in constant time and prove this space cost is asymptotically optimal. The previous best result by Krizanc et al. (Range Mode and Range Median Queries on Lists and Trees, Nordic Journal of Computing, 2005) uses O(n lg n) bits, or O(n) words, to achieve constant approximation for range median only. Thus we not only improve the space cost, but also provide support for any arbitrary k given at query time. We also analyse our solutions for non-constant alpha

    Improved Time and Space Bounds for Dynamic Range Mode

    Get PDF
    Given an array A of n elements, we wish to support queries for the most frequent and least frequent element in a subrange [l, r] of A. We also wish to support updates that change a particular element at index i or insert/ delete an element at index i. For the range mode problem, our data structure supports all operations in O(n^{2/3}) deterministic time using only O(n) space. This improves two results by Chan et al. [Timothy M. Chan et al., 2014]: a linear space data structure supporting update and query operations in O~(n^{3/4}) time and an O(n^{4/3}) space data structure supporting update and query operations in O~(n^{2/3}) time. For the range least frequent problem, we address two variations. In the first, we are allowed to answer with an element of A that may not appear in the query range, and in the second, the returned element must be present in the query range. For the first variation, we develop a data structure that supports queries in O~(n^{2/3}) time, updates in O(n^{2/3}) time, and occupies O(n) space. For the second variation, we develop a Monte Carlo data structure that supports queries in O(n^{2/3}) time, updates in O~(n^{2/3}) time, and occupies O~(n) space, but requires that updates are made independently of the results of previous queries. The Monte Carlo data structure is also capable of answering k-frequency queries; that is, the problem of finding an element of given frequency in the specified query range. Previously, no dynamic data structures were known for least frequent element or k-frequency queries

    I/O-Efficient Dynamic Planar Range Skyline Queries

    Get PDF
    We present the first fully dynamic worst case I/O-efficient data structures that support planar orthogonal \textit{3-sided range skyline reporting queries} in \bigO (\log_{2B^\epsilon} n + \frac{t}{B^{1-\epsilon}}) I/Os and updates in \bigO (\log_{2B^\epsilon} n) I/Os, using \bigO (\frac{n}{B^{1-\epsilon}}) blocks of space, for nn input planar points, tt reported points, and parameter 0ϵ10 \leq \epsilon \leq 1. We obtain the result by extending Sundar's priority queues with attrition to support the operations \textsc{DeleteMin} and \textsc{CatenateAndAttrite} in \bigO (1) worst case I/Os, and in \bigO(1/B) amortized I/Os given that a constant number of blocks is already loaded in main memory. Finally, we show that any pointer-based static data structure that supports \textit{dominated maxima reporting queries}, namely the difficult special case of 4-sided skyline queries, in \bigO(\log^{\bigO(1)}n +t) worst case time must occupy Ω(nlognloglogn)\Omega(n \frac{\log n}{\log \log n}) space, by adapting a similar lower bounding argument for planar 4-sided range reporting queries.Comment: Submitted to SODA 201

    Cell-Probe Lower Bounds from Online Communication Complexity

    Full text link
    In this work, we introduce an online model for communication complexity. Analogous to how online algorithms receive their input piece-by-piece, our model presents one of the players, Bob, his input piece-by-piece, and has the players Alice and Bob cooperate to compute a result each time before the next piece is revealed to Bob. This model has a closer and more natural correspondence to dynamic data structures than classic communication models do, and hence presents a new perspective on data structures. We first present a tight lower bound for the online set intersection problem in the online communication model, demonstrating a general approach for proving online communication lower bounds. The online communication model prevents a batching trick that classic communication complexity allows, and yields a stronger lower bound. We then apply the online communication model to prove data structure lower bounds for two dynamic data structure problems: the Group Range problem and the Dynamic Connectivity problem for forests. Both of the problems admit a worst case O(logn)O(\log n)-time data structure. Using online communication complexity, we prove a tight cell-probe lower bound for each: spending o(logn)o(\log n) (even amortized) time per operation results in at best an exp(δ2n)\exp(-\delta^2 n) probability of correctly answering a (1/2+δ)(1/2+\delta)-fraction of the nn queries

    Amortized Dynamic Cell-Probe Lower Bounds from Four-Party Communication

    Full text link
    This paper develops a new technique for proving amortized, randomized cell-probe lower bounds on dynamic data structure problems. We introduce a new randomized nondeterministic four-party communication model that enables "accelerated", error-preserving simulations of dynamic data structures. We use this technique to prove an Ω(n(logn/loglogn)2)\Omega(n(\log n/\log\log n)^2) cell-probe lower bound for the dynamic 2D weighted orthogonal range counting problem (2D-ORC) with n/polylognn/\mathrm{poly}\log n updates and nn queries, that holds even for data structures with exp(Ω~(n))\exp(-\tilde{\Omega}(n)) success probability. This result not only proves the highest amortized lower bound to date, but is also tight in the strongest possible sense, as a matching upper bound can be obtained by a deterministic data structure with worst-case operational time. This is the first demonstration of a "sharp threshold" phenomenon for dynamic data structures. Our broader motivation is that cell-probe lower bounds for exponentially small success facilitate reductions from dynamic to static data structures. As a proof-of-concept, we show that a slightly strengthened version of our lower bound would imply an Ω((logn/loglogn)2)\Omega((\log n /\log\log n)^2) lower bound for the static 3D-ORC problem with O(nlogO(1)n)O(n\log^{O(1)}n) space. Such result would give a near quadratic improvement over the highest known static cell-probe lower bound, and break the long standing Ω(logn)\Omega(\log n) barrier for static data structures
    corecore