
    Lower Bounds for Oblivious Near-Neighbor Search

    We prove an $\Omega(d \lg n/(\lg\lg n)^2)$ lower bound on the dynamic cell-probe complexity of statistically $\mathit{oblivious}$ approximate-near-neighbor search ($\mathsf{ANN}$) over the $d$-dimensional Hamming cube. For the natural setting of $d = \Theta(\log n)$, our result implies an $\tilde{\Omega}(\lg^2 n)$ lower bound, which is a quadratic improvement over the highest (non-oblivious) cell-probe lower bound for $\mathsf{ANN}$. This is the first super-logarithmic $\mathit{unconditional}$ lower bound for $\mathsf{ANN}$ against general (non-black-box) data structures. We also show that any oblivious $\mathit{static}$ data structure for decomposable search problems (like $\mathsf{ANN}$) can be obliviously dynamized with $O(\log n)$ overhead in update and query time, strengthening a classic result of Bentley and Saxe (Algorithmica, 1980). Comment: 28 pages
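    The dynamization result strengthens the classic Bentley and Saxe logarithmic method, sketched below for an ordinary (non-oblivious) decomposable search problem: keep static structures whose sizes are distinct powers of two and, on insertion, merge equal-size structures like a binary counter. The sorted-list "static structure" and membership query are stand-ins chosen for brevity, not the paper's oblivious construction.

```python
# Minimal sketch of Bentley-Saxe dynamization for a decomposable search
# problem. The "static structure" here is a sorted list with membership
# queries; any static structure whose answers combine across parts
# (decomposability) works the same way. Illustrative only.
import bisect

class DynamizedSet:
    def __init__(self):
        self.levels = []        # levels[i]: None or a sorted list of size 2^i

    def insert(self, x):
        carry, i = [x], 0
        # Binary-counter merge: each element takes part in O(log n) rebuilds,
        # so inserts cost O(log n) amortized (times the static build cost).
        while i < len(self.levels) and self.levels[i] is not None:
            carry = sorted(carry + self.levels[i])   # rebuild static structure
            self.levels[i] = None
            i += 1
        if i == len(self.levels):
            self.levels.append(None)
        self.levels[i] = carry

    def contains(self, x):
        # Decomposability: query every level and OR the answers, an O(log n)
        # overhead on top of the static query time.
        for level in self.levels:
            if level is not None:
                j = bisect.bisect_left(level, x)
                if j < len(level) and level[j] == x:
                    return True
        return False

s = DynamizedSet()
for v in [5, 3, 8, 1]:
    s.insert(v)
assert s.contains(8) and not s.contains(7)
```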

    Connectivity Oracles for Graphs Subject to Vertex Failures

    We introduce new data structures for answering connectivity queries in graphs subject to batched vertex failures. A deterministic structure processes a batch of $d \leq d_{\star}$ failed vertices in $\tilde{O}(d^3)$ time and thereafter answers connectivity queries in $O(d)$ time. It occupies space $O(d_{\star} m \log n)$. We develop a randomized Monte Carlo version of our data structure with update time $\tilde{O}(d^2)$, query time $O(d)$, and space $\tilde{O}(m)$ for any failure bound $d \le n$. This is the first connectivity oracle for general graphs that can efficiently deal with an unbounded number of vertex failures. We also develop a more efficient Monte Carlo edge-failure connectivity oracle. Using space $O(n \log^2 n)$, $d$ edge failures are processed in $O(d \log d \log\log n)$ time and thereafter, connectivity queries are answered in $O(\log\log n)$ time, which are correct w.h.p. Our data structures are based on a new decomposition theorem for an undirected graph $G=(V,E)$, which is of independent interest. It states that for any terminal set $U \subseteq V$ we can remove a set $B$ of $|U|/(s-2)$ vertices such that the remaining graph contains a Steiner forest for $U - B$ with maximum degree $s$.
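    For contrast with the oracle bounds above, the naive baseline recomputes connectivity from scratch after each failure batch in $O(n+m)$ time, which is exactly what the $\tilde{O}(d^3)$ (deterministic) and $\tilde{O}(d^2)$ (Monte Carlo) update times avoid. The sketch below only fixes the query interface; it is not the paper's structure, and all names are illustrative.

```python
# Naive baseline for batched vertex-failure connectivity queries: delete the
# failed vertices and recompute components with a BFS. This costs O(n + m)
# per failure batch, versus the paper's ~O(d^3)/~O(d^2) update and O(d) query
# times. Purely illustrative of the interface.
from collections import deque

def components_after_failures(adj, failed):
    """adj: dict vertex -> list of neighbors; failed: set of failed vertices.
    Returns a dict mapping each surviving vertex to a component id."""
    comp, cid = {}, 0
    for s in adj:
        if s in failed or s in comp:
            continue
        comp[s] = cid
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in failed and v not in comp:
                    comp[v] = cid
                    q.append(v)
        cid += 1
    return comp

def connected(comp, u, v):
    return u in comp and v in comp and comp[u] == comp[v]

# Example: a path 1-2-3-4; failing vertex 2 disconnects 1 from 3 and 4.
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
comp = components_after_failures(adj, failed={2})
assert not connected(comp, 1, 3) and connected(comp, 3, 4)
```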

    Fundamental Limits on Data Acquisition: Trade-offs between Sample Complexity and Query Difficulty

    We consider query-based data acquisition and the corresponding information recovery problem, where the goal is to recover $k$ binary variables (information bits) from parity measurements of those variables. The queries and the corresponding parity measurements are designed using the encoding rule of Fountain codes. By using Fountain codes, we can design a potentially limitless number of queries and corresponding parity measurements, and guarantee that the original $k$ information bits can be recovered with high probability from any sufficiently large set of measurements of size $n$. In the query design, the average number of information bits associated with one parity measurement is called the query difficulty ($\bar{d}$), and the minimum number of measurements required to recover the $k$ information bits for a fixed $\bar{d}$ is called the sample complexity ($n$). We analyze the fundamental trade-offs between the query difficulty and the sample complexity, and show that a sample complexity of $n = c\max\{k, (k\log k)/\bar{d}\}$ for some constant $c>0$ is necessary and sufficient to recover $k$ information bits with high probability as $k \to \infty$.
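    A hedged sketch of the pipeline the abstract describes: parity queries built in the style of Fountain (LT) codes over GF(2), decoded by the standard peeling procedure. The uniform degree draw below is a crude stand-in for a proper Fountain degree distribution (e.g., a soliton distribution), so the constants are not those of the paper.

```python
# Toy version of query-based acquisition with Fountain-style parity queries:
# each measurement is the XOR of a random subset of the k information bits,
# and a peeling decoder recovers the bits from enough measurements.
import random

def make_query(bits, rng, max_deg=7):
    deg = rng.randint(1, max_deg)                 # query difficulty ~ E[deg]
    idx = set(rng.sample(range(len(bits)), deg))  # bits touched by this query
    parity = 0
    for i in idx:
        parity ^= bits[i]
    return idx, parity

def peel_decode(k, queries):
    """Peeling: repeatedly resolve queries with one unknown bit left."""
    known, progress = {}, True
    while progress and len(known) < k:
        progress = False
        for s, p in queries:
            unknown = s - known.keys()
            if len(unknown) == 1:
                for j in s & known.keys():        # strip recovered bits
                    p ^= known[j]
                known[unknown.pop()] = p
                progress = True
    return [known[i] for i in range(k)] if len(known) == k else None

rng = random.Random(0)
k = 16
bits = [rng.randrange(2) for _ in range(k)]
queries = []
while True:                                       # grow n until decoding works
    queries.append(make_query(bits, rng))
    decoded = peel_decode(k, queries)
    if decoded is not None:
        break
assert decoded == bits
print(f"recovered k={k} bits from n={len(queries)} parity measurements")
```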

    Towards a compact representation of temporal rasters

    Considerable research effort has been devoted to efficiently managing spatio-temporal data. However, most works have focused on vector data, and much less on raster data. This work presents a new representation for raster data that evolve over time, named Temporal $k^2$ raster. It addresses the two main issues that arise when dealing with spatio-temporal data: space consumption and query response times. It extends a compact data structure for raster data in order to manage time, so the structure can be queried directly in compressed form, instead of the classical approach that requires a complete decompression before any manipulation. In addition, within the same compressed space, the new data structure includes two indexes: a spatial index and an index on the values of the cells, thus becoming a self-index for raster data. Comment: This research has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie Actions H2020-MSCA-RISE-2015 BIRDS GA No. 690941. Published in SPIRE 2018
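    To make "query directly in compressed form" concrete, here is a minimal $k^2$-tree (with $k=2$) over a binary raster, the kind of compact structure the Temporal $k^2$ raster builds on: the tree is a level-order bit array in which empty subgrids are pruned, and a cell is read by rank-based navigation without decompressing anything. This is an assumed simplification (binary values, a single time step), not the paper's full structure.

```python
# Minimal k^2-tree (k = 2) over a binary 2^h x 2^h raster: one level-order
# (BFS) bit array plus a prefix-rank array, with cells read by navigating
# the bits via rank. Empty quadrants are pruned, which is where the
# compression comes from.
from itertools import accumulate

def build(mat):
    """Level-order bit array of a 2x2-subdivision tree over mat."""
    T, frontier = [], [(0, 0, len(mat))]
    while frontier:
        nxt = []
        for r, c, s in frontier:
            h = s // 2
            for dr, dc in ((0, 0), (0, h), (h, 0), (h, h)):
                one = any(mat[r + dr + i][c + dc + j]
                          for i in range(h) for j in range(h))
                T.append(int(one))
                if one and h > 1:          # only nonempty subgrids recurse
                    nxt.append((r + dr, c + dc, h))
        frontier = nxt
    return T, [0] + list(accumulate(T))    # bits + exclusive prefix rank

def access(T, pref, n, r, c):
    """Cell (r, c) of the raster, read from the compressed form."""
    s, p = n, 0                   # subgrid size, start of current child group
    while True:
        s //= 2
        b = p + (r // s) * 2 + (c // s)    # bit of the quadrant holding (r, c)
        if T[b] == 0:
            return 0              # whole subtree is zeros: it was pruned
        if s == 1:
            return 1
        # Children of the g-th 1-bit (inclusive rank g) start at bit 4*g.
        r, c, p = r % s, c % s, 4 * (pref[b] + 1)

mat = [[0, 0, 0, 1],
       [0, 0, 0, 0],
       [0, 0, 1, 1],
       [0, 0, 0, 0]]
T, pref = build(mat)
assert all(access(T, pref, 4, i, j) == mat[i][j]
           for i in range(4) for j in range(4))
```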

    Crossing the Logarithmic Barrier for Dynamic Boolean Data Structure Lower Bounds

    This paper proves the first super-logarithmic lower bounds on the cell-probe complexity of dynamic boolean (a.k.a. decision) data structure problems, a long-standing milestone in data structure lower bounds. We introduce a new method for proving dynamic cell-probe lower bounds and use it to prove an $\tilde{\Omega}(\log^{1.5} n)$ lower bound on the operational time of a wide range of boolean data structure problems, most notably, on the query time of dynamic range counting over $\mathbb{F}_2$ ([Pat07]). Proving an $\omega(\lg n)$ lower bound for this problem was explicitly posed as one of five important open problems in the late Mihai Pătrașcu's obituary [Tho13]. This result also implies the first $\omega(\lg n)$ lower bound for the classical 2D range counting problem, one of the most fundamental data structure problems in computational geometry and spatial databases. We derive similar lower bounds for boolean versions of dynamic polynomial evaluation and 2D rectangle stabbing, and for the (non-boolean) problems of range selection and range median. Our technical centerpiece is a new way of "weakly" simulating dynamic data structures using efficient one-way communication protocols with small advantage over random guessing. This simulation involves a surprising excursion to low-degree (Chebyshev) polynomials which may be of independent interest, and offers an entirely new algorithmic angle on the "cell sampling" method of Panigrahy et al. [PTW10].
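    For context, the one-dimensional analogue of dynamic range counting over $\mathbb{F}_2$ has the textbook $O(\log n)$ upper bound below (a Fenwick tree with XOR in place of addition); the paper's $\tilde{\Omega}(\log^{1.5} n)$ lower bound concerns the harder two-dimensional version, so this sketch only pins down the kind of operations being counted.

```python
# Dynamic prefix/range parity over F_2 in one dimension: point updates flip
# a bit, queries return the XOR (parity) of a range. A Fenwick (binary
# indexed) tree with XOR instead of addition gives O(log n) per operation.
class XorFenwick:
    def __init__(self, n):
        self.n = n
        self.t = [0] * (n + 1)

    def flip(self, i):
        """Flip bit i (0-indexed)."""
        i += 1
        while i <= self.n:
            self.t[i] ^= 1
            i += i & -i

    def prefix_parity(self, i):
        """Parity of bits 0..i-1."""
        p = 0
        while i > 0:
            p ^= self.t[i]
            i -= i & -i
        return p

    def range_parity(self, l, r):
        """Parity of bits in the half-open range [l, r)."""
        return self.prefix_parity(r) ^ self.prefix_parity(l)

f = XorFenwick(8)
f.flip(3); f.flip(5)
assert f.range_parity(0, 8) == 0 and f.range_parity(3, 5) == 1
```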

    LRM-Trees: Compressed Indices, Adaptive Sorting, and Compressed Permutations

    LRM-Trees are an elegant way to partition a sequence of values into sorted consecutive blocks, and to express the relative position of the first element of each block within a previous block. They were used to encode ordinal trees and to index integer arrays in order to support range minimum queries on them. We describe how they yield many other convenient results in a variety of areas, from data structures to algorithms: some compressed succinct indices for range minimum queries; a new adaptive sorting algorithm; and a compressed succinct data structure for permutations supporting direct and inverse application in time that decreases as the permutation becomes more compressible. Comment: 13 pages, 1 figure
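    The basic object is easy to compute: in an LRM-Tree the parent of position $i$ is the nearest previous position holding a strictly smaller value, obtainable in $O(n)$ with a stack, and $parent[i] = i-1$ exactly inside an ascending run, which is how the tree captures the sorted consecutive blocks. A minimal sketch (conventions such as tie-breaking are assumptions):

```python
# Minimal LRM-Tree construction: parent[i] is the nearest j < i with
# A[j] < A[i], or -1 for children of the artificial root. O(n) total,
# since each index is pushed and popped at most once. Breaking ties with
# "strictly smaller" is an assumed convention.
def lrm_parents(A):
    parent, stack = [-1] * len(A), []
    for i, x in enumerate(A):
        while stack and A[stack[-1]] >= x:
            stack.pop()
        parent[i] = stack[-1] if stack else -1
        stack.append(i)
    return parent

A = [3, 1, 4, 1, 5, 9, 2, 6]
p = lrm_parents(A)
assert p == [-1, -1, 1, -1, 3, 4, 3, 6]
# parent[i] == i - 1 precisely when A[i-1] < A[i], i.e., within an ascending
# run; the sorted blocks here are [3], [1, 4], [1, 5, 9], [2, 6].
assert all((p[i] == i - 1) == (A[i - 1] < A[i]) for i in range(1, len(A)))
```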