25 research outputs found

Decompressing Lempel-Ziv Compressed Text

Authors: Bille Philip, Ettienne Mikko Berggren, Gagie Travis, Gørtz Inge Li, Prezza Nicola
Publication date: 04/11/2019

We consider the problem of decompressing the Lempel-Ziv 77 representation of a string $S$ of length $n$ using a working space as close as possible to the size $z$ of the input. The folklore solution for the problem runs in $O(n)$ time but requires random access to the whole decompressed text. Another folklore solution is to convert LZ77 into a grammar of size $O(z\log(n/z))$ and then stream $S$ in linear time. In this paper, we show that $O(n)$ time and $O(z)$ working space can be achieved for constant-size alphabets. On general alphabets of size $\sigma$, we describe (i) a trade-off achieving $O(n\log^\delta \sigma)$ time and $O(z\log^{1-\delta}\sigma)$ space for any $0\leq \delta\leq 1$, and (ii) a solution achieving $O(n)$ time and $O(z\log\log (n/z))$ space. The latter solution, in particular, dominates both folklore algorithms for the problem. Our solutions can, more generally, extract any specified subsequence of $S$ with little overhead on top of the linear running time and working space. As an immediate corollary, we show that our techniques yield improved results for pattern matching problems on LZ77-compressed text.
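The first folklore solution mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it keeps the whole decoded text in memory, which is exactly the random-access requirement the paper avoids. The phrase encoding `(source, length, next_char)` and the function name `lz77_decompress` are assumptions made for this example; LZ77 variants differ in such details.

```python
from typing import List, Tuple

# Assumed phrase format (hypothetical, for illustration only):
# copy `length` characters starting at 0-based position `source` of the
# text decoded so far, then append `next_char` (may be the empty string).
Phrase = Tuple[int, int, str]


def lz77_decompress(phrases: List[Phrase]) -> str:
    """Decode phrases in O(n) time, using random access to the decoded text."""
    out: List[str] = []
    for source, length, next_char in phrases:
        if length > 0 and not 0 <= source < len(out):
            raise ValueError("phrase source must point into the decoded text")
        # Copy one character at a time so that a phrase may overlap itself,
        # i.e. read characters that the same phrase has just written.
        for k in range(length):
            out.append(out[source + k])
        if next_char:
            out.append(next_char)
    return "".join(out)


if __name__ == "__main__":
    # Two literal phrases followed by a self-overlapping copy of length 5.
    print(lz77_decompress([(0, 0, "a"), (0, 0, "b"), (0, 5, "")]))
```

The output list grows to length $n$, so the working space here is $\Theta(n)$ rather than close to $z$.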

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari

Archivio della ricerca- LUISS Libera Università Internazionale degli Studi Sociali Guido Carli di Roma

Online Research Database In Technology

Mergeable Dictionaries

Author: Iacono John
Publication venue: Dagstuhl Seminar Proceedings. 10091 - Data Structures
Publication date: 01/01/2010

A data structure is presented for the Mergeable Dictionary abstract data type, which supports the following operations on a collection of disjoint sets of totally ordered data: Predecessor-Search, Split and Union. While Predecessor-Search and Split work in the normal way, the novel operation is Union. While in a typical mergeable dictionary (e.g. 2-4 Trees), the Union operation can only be performed on sets that span disjoint intervals in keyspace, the structure here has no such limitation, and permits the merging of arbitrarily interleaved sets. Our data structure supports all operations, including Union, in O(log n) amortized time, thus showing that interleaved Union operations can be supported at no additional cost vis-a-vis disjoint Union operations
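To make the three operations of the abstract data type concrete, here is a deliberately naive sketch of the interface. It is not the data structure from the paper: Union here takes time proportional to the sizes of both sets, not O(log n) amortized. The class name `NaiveMergeableDictionary` is hypothetical, and Predecessor-Search is assumed to return the largest key less than or equal to the query.

```python
import bisect
from typing import Iterable, List, Optional, Tuple


class NaiveMergeableDictionary:
    """One set of totally ordered keys, stored as a sorted list (illustration only)."""

    def __init__(self, keys: Iterable[int] = ()) -> None:
        self.keys: List[int] = sorted(set(keys))

    def predecessor(self, x: int) -> Optional[int]:
        """Largest key <= x, or None if no such key exists (assumed convention)."""
        i = bisect.bisect_right(self.keys, x)
        return self.keys[i - 1] if i > 0 else None

    def split(self, x: int) -> Tuple["NaiveMergeableDictionary", "NaiveMergeableDictionary"]:
        """Return (keys <= x, keys > x) as two new sets."""
        i = bisect.bisect_right(self.keys, x)
        return NaiveMergeableDictionary(self.keys[:i]), NaiveMergeableDictionary(self.keys[i:])

    def union(self, other: "NaiveMergeableDictionary") -> "NaiveMergeableDictionary":
        """Merge two sets; their keys may be arbitrarily interleaved."""
        return NaiveMergeableDictionary(set(self.keys) | set(other.keys))


if __name__ == "__main__":
    a = NaiveMergeableDictionary([1, 5, 9])
    b = NaiveMergeableDictionary([2, 6, 10])  # interleaved with a
    u = a.union(b)
    print(u.keys, u.predecessor(7))
```

The interleaved case shown in the usage lines is the one that a structure restricted to disjoint key intervals cannot merge directly.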

Dagstuhl Research Online Publication Server

Compressed and efficient algorithms and data structures for strings

Author: Ettienne Mikko Berggren
Publication venue: DTU Compute
Publication date: 01/01/2018

Online Research Database In Technology

15th Scandinavian Symposium and Workshops on Algorithm Theory: SWAT 2016, June 22-24, 2016, Reykjavik, Iceland

Publication venue: Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
Publication date: 01/06/2016

Digitale Bibliothek Thüringen

Thirty nine years of stratified trees

Author: van Emde Boas Peter
Publication date: 19/12/2013

The stratified tree, also called van Emde Boas tree, is a data structure implementing the full repertoire of instructions manipulating a single subset $A$ of a finite ordered universe $U = [0 \ldots u-1]$. Instructions include member, insert, delete, min, max, predecessor and successor, as well as composite ones like extract-min. The processing time per instruction is $O(\log\log u)$. Hence it improves upon the traditional comparison-based tree structures for dense subsets $A$; if $A$ is sparse, meaning that the size $n = \#A = O(\log u)$, the improvement vanishes. Examples exist where this improvement helps to speed up algorithmic solutions of real problems; such applications can be found, for example, in graph algorithms, computational geometry and forwarding of packets on the internet. The structure was invented during a short postdoc residence at Cornell University in the fall of 1974. In the sequel of this paper I will use the original name Stratified Trees, which was used in my own papers on this data structure. There are two strategies for understanding how this $O(\log\log u)$ improvement can be obtained. Today a direct recursive approach is used where the universe is divided into a cluster of $\sqrt{u}$ galaxies, each of size $\sqrt{u}$; the set manipulation instructions decompose accordingly into an instruction at the cluster level and one at the galaxy level, but one of these two instructions is always of a special trivial type. The processing complexity thus satisfies a recurrence $T(u) = T(\sqrt{u}) + O(1)$. Consequently $T(u) = O(\log\log u)$. However, this recursive approach requires address calculations on the arguments which use multiplicative arithmetical instructions. These instructions are not allowed in the Random Access Machine model (RAM), which was the standard model in the developing research area of design and analysis of algorithms in 1974. Therefore the early implementations of the stratified trees are based on a different approach, which is best described as a binary-search-on-levels strategy. In this approach the address calculations are not required, and the structure can be implemented using pointers. The downside of this approach is that it leads to rather complex algorithms, which are still hard to present correctly even today. Another bad consequence was the superlinear space consumption of the data structure, which was only eliminated three years later. In this paper I want to describe the historical background against which the stratified trees were discovered and implemented. I do not give complete code fragments implementing the data structure and the operations; they can be found in the various textbooks and papers mentioned, including a Wikipedia page. Code fragments appearing in this paper are copied verbatim from the original sources; the same holds for the figures.
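The step from the recurrence to the $O(\log\log u)$ bound is stated in the abstract without proof. One standard way to see it, sketched here under the assumption that $u$ is a power of two so the square roots stay well behaved, is the substitution $m = \log_2 u$ (the helper function $S$ is introduced only for this derivation):

```latex
\begin{align*}
T(u) &= T(\sqrt{u}) + O(1), \qquad m = \log_2 u, \quad S(m) = T(2^m) \\
S(m) &= T\bigl(2^{m/2}\bigr) + O(1) = S(m/2) + O(1) \\
S(m) &= O(\log m) \\
T(u) &= S(\log_2 u) = O(\log\log u)
\end{align*}
```

Each recursive step halves the number of bits $m$ needed to address the universe, so the recursion depth is logarithmic in $m$, that is, doubly logarithmic in $u$.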

Epoka University

The 1st Conference of PhD Students in Computer Science

Publication date: 01/01/1998

University of Szeged