51 research outputs found

    Simple and Efficient Fully-Functional Succinct Trees

    The fully-functional succinct tree representation of Navarro and Sadakane (ACM Transactions on Algorithms, 2014) supports a large number of operations in constant time using $2n+o(n)$ bits. However, the full idea is hard to implement. Only a simplified version with $O(\log n)$ operation time has been implemented and shown to be practical and competitive. We describe a new variant of the original idea that is much simpler to implement and has worst-case time $O(\log\log n)$ for the operations. An implementation based on this version is experimentally shown to be superior to existing implementations.

    Dynamic Data Structures for Document Collections and Graphs

    In the dynamic indexing problem, we must maintain a changing collection of text documents so that we can efficiently support insertions, deletions, and pattern matching queries. We are especially interested in developing efficient data structures that store and query the documents in compressed form. All previous compressed solutions to this problem rely on answering rank and select queries on a dynamic sequence of symbols. Because of the lower bound in [Fredman and Saks, 1989], answering rank queries presents a bottleneck in compressed dynamic indexing. In this paper we show how this lower bound can be circumvented using our new framework. We demonstrate that the gap between static and dynamic variants of the indexing problem can be almost closed. Our method is based on a novel framework for adding dynamism to static compressed data structures. Our framework also applies more generally to dynamizing other problems. We show, for example, how our framework can be applied to develop compressed representations of dynamic graphs and binary relations.

    Succinct Indexable Dictionaries with Applications to Encoding $k$-ary Trees, Prefix Sums and Multisets

    We consider the indexable dictionary problem, which consists of storing a set $S \subseteq \{0,...,m-1\}$ for some integer $m$, while supporting the operations of $\mathrm{Rank}(x)$, which returns the number of elements in $S$ that are less than $x$ if $x \in S$, and $-1$ otherwise; and $\mathrm{Select}(i)$, which returns the $i$-th smallest element in $S$. We give a data structure that supports both operations in $O(1)$ time on the RAM model and requires ${\cal B}(n,m) + o(n) + O(\lg \lg m)$ bits to store a set of size $n$, where ${\cal B}(n,m) = \lceil \lg {m \choose n} \rceil$ is the minimum number of bits required to store any $n$-element subset from a universe of size $m$. Previous dictionaries taking this space only supported (yes/no) membership queries in $O(1)$ time. In the cell probe model we can remove the $O(\lg \lg m)$ additive term in the space bound, answering a question raised by Fich and Miltersen, and Pagh. We present extensions and applications of our indexable dictionary data structure, including: an information-theoretically optimal representation of a $k$-ary cardinal tree that supports standard operations in constant time; a representation of a multiset of size $n$ from $\{0,...,m-1\}$ in ${\cal B}(n,m+n) + o(n)$ bits that supports (appropriate generalizations of) $\mathrm{Rank}$ and $\mathrm{Select}$ operations in constant time; and a representation of a sequence of $n$ non-negative integers summing up to $m$ in ${\cal B}(n,m+n) + o(n)$ bits that supports prefix sum queries in constant time.
    Comment: Final version of SODA 2002 paper; supersedes Leicester Tech report 2002/1
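    The Rank and Select semantics in this abstract can be illustrated with a small reference sketch. It is not the succinct structure from the paper: it stores the set as a plain sorted list and answers queries by binary search, so it matches neither the space bound nor the constant query time. The class name `NaiveIndexableDictionary`, the helper `info_theoretic_bits`, and the choice of 1-based indexing for Select are assumptions made for illustration.

```python
from bisect import bisect_left
from math import comb


class NaiveIndexableDictionary:
    """Reference semantics only: a sorted list, not a succinct structure."""

    def __init__(self, elements, m):
        self.m = m
        self.items = sorted(set(elements))
        if self.items and not (0 <= self.items[0] and self.items[-1] < m):
            raise ValueError("elements must lie in {0, ..., m-1}")

    def rank(self, x):
        """Number of elements smaller than x if x is in S, else -1."""
        pos = bisect_left(self.items, x)
        if pos < len(self.items) and self.items[pos] == x:
            return pos
        return -1

    def select(self, i):
        """The i-th smallest element of S (assumed 1-based here)."""
        if not 1 <= i <= len(self.items):
            raise IndexError("i must be in 1..n")
        return self.items[i - 1]


def info_theoretic_bits(n, m):
    """B(n, m) = ceil(lg C(m, n)), computed exactly with integers."""
    # For an integer c >= 1, ceil(log2(c)) equals (c - 1).bit_length().
    return (comb(m, n) - 1).bit_length()


d = NaiveIndexableDictionary([3, 7, 8], m=10)
print(d.rank(7), d.rank(5), d.select(1))  # rank of a member, a non-member, smallest element
print(info_theoretic_bits(3, 10))         # bits needed for a 3-subset of a 10-element universe
```

    Because Rank returns -1 for non-members, it doubles as a membership test, which is why the abstract contrasts it with dictionaries that only answer yes/no queries.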

    Succinct Representations of Dynamic Strings

    The rank and select operations over a string of length $n$ from an alphabet of size $\sigma$ have been used widely in the design of succinct data structures. In many applications, the string itself needs to be maintained dynamically, allowing characters of the string to be inserted and deleted. Under the word RAM model with word size $w=\Omega(\lg n)$, we design a succinct representation of dynamic strings using $nH_0 + o(n)\lg\sigma + O(w)$ bits to support rank, select, insert and delete in $O(\frac{\lg n}{\lg\lg n}(\frac{\lg \sigma}{\lg\lg n}+1))$ time. When the alphabet size is small, i.e. when $\sigma = O(\mathrm{polylog}(n))$, including the case in which the string is a bit vector, these operations are supported in $O(\frac{\lg n}{\lg\lg n})$ time. Our data structures are more efficient than previous results on the same problem, and we have applied them to improve results on the design and construction of space-efficient text indexes.

    Space Efficient Structures for JSON Documents

    With the rapid increase of JSON documents on the web, indexing, storing and retrieving these documents has become a significant problem. Implementations that load JSON documents and give access to them suffer from huge memory demands: the in-memory representation of a JSON document is larger than its file size. This is a problem for machines with limited memory such as mobile devices, where processing even moderately-sized JSON documents requires more memory than is available. Both JSON data and the existing queries possess an inherent tree structure, and thus fast child-parent lookup is a necessity to improve performance. We focus on in-memory representations of JSON documents for situations where space is limited and where rapid processing time is important. With this paper, we hope to spark a discussion on the application of succinct data structures that support operations on the document tree at a speed comparable with an in-memory deserialized object, thus bridging textual formats with binary formats.

    Compressed Data Structures for Dynamic Sequences

    We consider the problem of storing a dynamic string $S$ over an alphabet $\Sigma=\{\,1,\ldots,\sigma\,\}$ in compressed form. Our representation supports insertions and deletions of symbols and answers three fundamental queries: $\mathrm{access}(i,S)$ returns the $i$-th symbol in $S$, $\mathrm{rank}_a(i,S)$ counts how many times a symbol $a$ occurs among the first $i$ positions in $S$, and $\mathrm{select}_a(i,S)$ finds the position where a symbol $a$ occurs for the $i$-th time. We present the first fully-dynamic data structure for arbitrarily large alphabets that achieves optimal query times for all three operations and supports updates with worst-case time guarantees. Ours is also the first fully-dynamic data structure that needs only $nH_k+o(n\log\sigma)$ bits, where $H_k$ is the $k$-th order entropy and $n$ is the string length. Moreover, our representation supports extraction of a substring $S[i..i+\ell]$ in optimal $O(\log n/\log\log n + \ell/\log_{\sigma}n)$ time.
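    As a reading aid, the three queries and the two updates named in this abstract can be written down as a naive reference implementation. This is only a sketch of the interface: it uses an uncompressed Python list with linear-time operations, so it achieves none of the stated space or time bounds. The class name `NaiveDynamicSequence`, the method names, and the 1-based positions are assumptions for illustration.

```python
class NaiveDynamicSequence:
    """Uncompressed reference model of a dynamic sequence (1-based positions)."""

    def __init__(self, symbols=()):
        self.s = list(symbols)

    def _check(self, i, upper):
        if not 1 <= i <= upper:
            raise IndexError(f"position {i} out of range 1..{upper}")

    def access(self, i):
        """Return the i-th symbol."""
        self._check(i, len(self.s))
        return self.s[i - 1]

    def rank(self, a, i):
        """Count occurrences of symbol a among the first i positions."""
        if not 0 <= i <= len(self.s):
            raise IndexError(f"prefix length {i} out of range")
        return self.s[:i].count(a)

    def select(self, a, i):
        """Position of the i-th occurrence of a, or None if there is none."""
        seen = 0
        for pos, symbol in enumerate(self.s, start=1):
            if symbol == a:
                seen += 1
                if seen == i:
                    return pos
        return None

    def insert(self, i, a):
        """Insert a so that it becomes the i-th symbol."""
        self._check(i, len(self.s) + 1)
        self.s.insert(i - 1, a)

    def delete(self, i):
        """Delete the i-th symbol."""
        self._check(i, len(self.s))
        del self.s[i - 1]


seq = NaiveDynamicSequence("abracadabra")
print(seq.access(1), seq.rank("a", 4), seq.select("a", 3))
seq.insert(1, "x")            # every later position shifts by one
print(seq.select("a", 3))
```

    Rank and select are inverse in the sense that `rank(a, select(a, i)) == i` whenever the `i`-th occurrence exists, which is a handy sanity check for any real implementation.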