25 research outputs found

    Decompressing Lempel-Ziv Compressed Text

    We consider the problem of decompressing the Lempel-Ziv 77 representation of a string $S$ of length $n$ using a working space as close as possible to the size $z$ of the input. The folklore solution for the problem runs in $O(n)$ time but requires random access to the whole decompressed text. Another folklore solution is to convert LZ77 into a grammar of size $O(z\log(n/z))$ and then stream $S$ in linear time. In this paper, we show that $O(n)$ time and $O(z)$ working space can be achieved for constant-size alphabets. On general alphabets of size $\sigma$, we describe (i) a trade-off achieving $O(n\log^\delta \sigma)$ time and $O(z\log^{1-\delta}\sigma)$ space for any $0\leq \delta\leq 1$, and (ii) a solution achieving $O(n)$ time and $O(z\log\log (n/z))$ space. The latter solution, in particular, dominates both folklore algorithms for the problem. Our solutions can, more generally, extract any specified subsequence of $S$ with little overhead on top of the linear running time and working space. As an immediate corollary, we show that our techniques yield improved results for pattern matching problems on LZ77-compressed text.
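    For orientation, the first folklore solution mentioned in the abstract can be sketched as follows. This is only a minimal illustration, not the paper's algorithm: the phrase encoding (a literal character, or a hypothetical `(source, length)` pair pointing into the already decoded text) and the function name `lz77_decompress` are assumptions made here. It runs in $O(n)$ time but keeps the whole output in memory, which is exactly the random-access requirement the paper aims to avoid.

```python
def lz77_decompress(phrases):
    """Decode an LZ77-style parse by copying from the already decoded text.

    Assumed (hypothetical) encoding: each phrase is either a single literal
    character or a (source, length) pair referring to the output so far.
    """
    out = []  # the whole decompressed text: O(n) working space
    for phrase in phrases:
        if isinstance(phrase, str):
            out.append(phrase)  # literal character
        else:
            source, length = phrase
            # Copy one character at a time so that a phrase may overlap
            # the text it is currently producing (self-referential copy).
            for i in range(length):
                out.append(out[source + i])
    return "".join(out)


decoded = lz77_decompress(["a", "b", (0, 4), "c"])
```

    Here the copy `(0, 4)` starts at position 0 and overlaps its own output, which is why the inner loop copies character by character instead of slicing.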

    Mergeable Dictionaries

    A data structure is presented for the Mergeable Dictionary abstract data type, which supports the following operations on a collection of disjoint sets of totally ordered data: Predecessor-Search, Split and Union. While Predecessor-Search and Split work in the normal way, the novel operation is Union. While in a typical mergeable dictionary (e.g. 2-4 Trees), the Union operation can only be performed on sets that span disjoint intervals in keyspace, the structure here has no such limitation, and permits the merging of arbitrarily interleaved sets. Our data structure supports all operations, including Union, in O(log n) amortized time, thus showing that interleaved Union operations can be supported at no additional cost vis-a-vis disjoint Union operations
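    To make the three operations concrete, here is a deliberately naive reference sketch of the abstract data type using sorted Python lists. It is not the data structure from the paper and does not achieve the $O(\log n)$ amortized bound (Split and Union are linear here); the function names, and the convention that Predecessor-Search returns the largest key not exceeding the query, are assumptions for illustration.

```python
import bisect
from heapq import merge


def predecessor_search(keys, x):
    """Largest key <= x in the sorted list `keys`, or None if there is none."""
    i = bisect.bisect_right(keys, x)
    return keys[i - 1] if i else None


def split(keys, x):
    """Split a sorted list into (keys <= x, keys > x)."""
    i = bisect.bisect_right(keys, x)
    return keys[:i], keys[i:]


def union(a, b):
    """Merge two disjoint sorted lists, even if their keys interleave."""
    return list(merge(a, b))


merged = union([1, 4, 9], [2, 3, 10])  # arbitrarily interleaved inputs
left, right = split(merged, 4)
```

    The point of the example is the interleaved Union: the two inputs do not occupy disjoint intervals of the key space, which is the case the abstract singles out.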

    15th Scandinavian Symposium and Workshops on Algorithm Theory: SWAT 2016, June 22-24, 2016, Reykjavik, Iceland


    Thirty nine years of stratified trees

    The stratified tree, also called van Emde Boas tree, is a data structure implementing the full repertoire of instructions manipulating a single subset $A$ of a finite ordered universe $U = [0 \ldots u-1]$. Instructions include member, insert, delete, min, max, predecessor and successor, as well as composite ones like extract-min. The processing time per instruction is $O(\log\log u)$. Hence it improves upon the traditional comparison-based tree structures for dense subsets $A$; if $A$ is sparse, meaning that the size $n = \#A = O(\log u)$, the improvement vanishes. Examples exist where this improvement helps to speed up algorithmic solutions of real problems; such applications can be found, for example, in graph algorithms, computational geometry and forwarding of packets on the internet. The structure was invented during a short postdoc residence at Cornell University in the fall of 1974. In the sequel of this paper I will use the original name Stratified Trees, which was used in my own papers on this data structure. There are two strategies for understanding how this $O(\log\log u)$ improvement can be obtained. Today a direct recursive approach is used where the universe is divided into a cluster of $\sqrt{u}$ galaxies each of size $\sqrt{u}$; the set manipulation instructions decompose accordingly into an instruction at the cluster level and one at the galaxy level, but one of these two instructions is always of a special trivial type. The processing complexity thus satisfies a recurrence $T(u) = T(\sqrt{u}) + O(1)$. Consequently $T(u) = O(\log\log u)$. However, this recursive approach requires address calculations on the arguments which use multiplicative arithmetical instructions. These instructions are not allowed in the Random Access Machine model (RAM), which was the standard model in the developing research area of design and analysis of algorithms in 1974. Therefore the early implementations of the stratified trees are based on a different approach which is best described as a binary-search-on-levels strategy. In this approach the address calculations are not required, and the structure can be implemented using pointers. The downside of this approach is that it leads to rather complex algorithms, which are still hard to present correctly even today. Another bad consequence was the superlinear space consumption of the data structure, which was only eliminated three years later. In this paper I want to describe the historical background against which the stratified trees were discovered and implemented. I do not give complete code fragments implementing the data structure and the operations; they can be found in the various textbooks and papers mentioned, including a Wikipedia page. Code fragments appearing in this paper are copied verbatim from the original sources; the same holds for the figures.
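    The direct recursive approach described in the abstract can be illustrated with a short sketch in the style of the common textbook presentation. It is not taken from the paper: the class name `VEB`, the restriction to universes of size $2^{\text{bits}}$, the use of bit shifts for the address calculations and the lazily allocated dictionary of galaxies are all assumptions made for this illustration, and only member, insert and successor are shown. Each call makes at most one non-trivial recursive call on a universe of roughly $\sqrt{u}$ elements, matching $T(u) = T(\sqrt{u}) + O(1)$.

```python
class VEB:
    """Sketch of a van Emde Boas tree over the universe {0, ..., 2**bits - 1}."""

    def __init__(self, bits):
        self.bits = bits
        self.min = None          # the minimum is kept here, not in a galaxy
        self.max = None
        if bits > 1:
            self.lo_bits = bits // 2
            self.hi_bits = bits - self.lo_bits
            self.summary = None  # VEB over the indices of non-empty galaxies
            self.galaxies = {}   # galaxy index -> VEB over the low bits

    def _split(self, x):
        # Address calculation: high bits select the galaxy, low bits the slot.
        return x >> self.lo_bits, x & ((1 << self.lo_bits) - 1)

    def member(self, x):
        if x == self.min or x == self.max:
            return True
        if self.bits <= 1:
            return False
        high, low = self._split(x)
        galaxy = self.galaxies.get(high)
        return galaxy is not None and galaxy.member(low)

    def insert(self, x):
        if self.min is None:                 # empty tree: trivial O(1) insert
            self.min = self.max = x
            return
        if x == self.min or x == self.max:   # already present
            return
        if x < self.min:
            x, self.min = self.min, x        # new minimum; push the old one down
        if self.bits > 1:
            high, low = self._split(x)
            galaxy = self.galaxies.get(high)
            if galaxy is None:
                # New galaxy: the summary insert recurses, while the insert
                # into the empty galaxy below is the trivial one.
                galaxy = self.galaxies[high] = VEB(self.lo_bits)
                if self.summary is None:
                    self.summary = VEB(self.hi_bits)
                self.summary.insert(high)
            galaxy.insert(low)
        if x > self.max:
            self.max = x

    def successor(self, x):
        """Smallest stored key strictly greater than x, or None."""
        if self.min is not None and x < self.min:
            return self.min
        if self.bits <= 1:
            return 1 if x == 0 and self.max == 1 else None
        if self.max is None or x >= self.max:
            return None
        high, low = self._split(x)
        galaxy = self.galaxies.get(high)
        if galaxy is not None and low < galaxy.max:
            return (high << self.lo_bits) | galaxy.successor(low)
        nxt = self.summary.successor(high)   # next non-empty galaxy
        return (nxt << self.lo_bits) | self.galaxies[nxt].min


tree = VEB(4)                # universe {0, ..., 15}
for key in (3, 9, 14, 2):
    tree.insert(key)
```

    Keeping the minimum outside the galaxies is what makes one of the two recursive instructions trivial: inserting into an empty galaxy only sets its `min` and `max`.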

    The 1st Conference of PhD Students in Computer Science
