Decompressing Lempel-Ziv Compressed Text
We consider the problem of decompressing the Lempel-Ziv 77 representation of
a string S of length n using a working space as close as possible to the
size z of the input. The folklore solution for the problem runs in O(n)
time but requires random access to the whole decompressed text. Another
folklore solution is to convert LZ77 into a grammar of size O(z log(n/z)) and
then stream S in linear time. In this paper, we show that O(n) time and O(z)
working space can be achieved for constant-size alphabets. On general
alphabets of size σ, we describe (i) a trade-off achieving O(n log^δ σ)
time and O(z log^(1-δ) σ) space for any
0 ≤ δ ≤ 1, and (ii) a solution achieving O(n) time and
O(z log log(n/z)) space. The latter solution, in particular, dominates both
folklore algorithms for the problem. Our solutions can, more generally, extract
any specified subsequence of S with little overhead on top of the linear
running time and working space. As an immediate corollary, we show that our
techniques yield improved results for pattern matching problems on
LZ77-compressed text.
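As an illustration of the first folklore solution mentioned in the abstract, here is a minimal Python sketch of a linear-time LZ77 decoder. The phrase format `(source_start, length, next_char)` and the function name `lz77_decode` are assumptions made for this example; they are not taken from the paper. The sketch keeps the whole output in memory, which is exactly the random-access requirement the paper sets out to avoid.

```python
def lz77_decode(phrases):
    """Decode a list of (source_start, length, next_char) phrases.

    Hypothetical phrase format, assumed for illustration:
      source_start: 0-based position in the already decoded text to copy from
      length:       number of characters to copy (0 for a pure literal)
      next_char:    one literal character appended after the copy, or None
    """
    out = []  # the whole decompressed text is kept for random access
    for src, length, ch in phrases:
        if length > 0 and not (0 <= src < len(out)):
            raise ValueError("phrase refers to text that is not decoded yet")
        # Copy character by character so that a phrase may overlap the
        # text it is currently producing (self-referential phrases).
        for k in range(length):
            out.append(out[src + k])
        if ch is not None:
            out.append(ch)
    return "".join(out)


# Usage: "abababc" parsed as two literals and one overlapping copy.
text = lz77_decode([(0, 0, "a"), (0, 0, "b"), (0, 4, "c")])
```

Each output character is written once, so the running time is linear in the length of the decompressed text, while the working space is also linear in that length rather than in the number of phrases.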
Mergeable Dictionaries
A data structure is presented for the Mergeable Dictionary
abstract data type, which supports the following operations on
a collection of disjoint sets of totally ordered data: Predecessor-Search, Split and Union. While Predecessor-Search and Split
work in the normal way, the novel operation is Union. While in
a typical mergeable dictionary (e.g. 2-4 Trees), the Union operation can only be performed on sets that span disjoint intervals in
keyspace, the structure here has no such limitation, and permits
the merging of arbitrarily interleaved sets. Our data structure
supports all operations, including Union, in O(log n) amortized
time, thus showing that interleaved Union operations can be supported at no additional cost vis-à-vis disjoint Union operations.
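To make the semantics of the three operations concrete, here is a naive Python sketch that stores each set as a sorted list. It only illustrates what the operations return, including Union on interleaved sets; it does not achieve the O(log n) amortized bound, and it is not the data structure of the paper. The function names, and the convention that a predecessor search returns the largest key less than or equal to the query, are assumptions made for this example.

```python
from bisect import bisect_right
from heapq import merge


def predecessor_search(s, x):
    """Largest key in the sorted list s that is <= x, or None if there is none."""
    i = bisect_right(s, x)
    return s[i - 1] if i else None


def split(s, x):
    """Split the sorted list s into (keys <= x, keys > x)."""
    i = bisect_right(s, x)
    return s[:i], s[i:]


def union(a, b):
    """Merge two disjoint sorted lists, however their keys interleave.

    This takes time linear in len(a) + len(b); the paper's structure is
    stated to support the same operation in O(log n) amortized time.
    """
    return list(merge(a, b))


# Usage: the two sets interleave in keyspace, which a plain
# concatenate-style merge of two search trees could not handle.
s = union([1, 4, 9], [2, 3, 10])
left, right = split(s, 4)
```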
Thirty nine years of stratified trees
The stratified tree, also called van Emde Boas tree, is a data structure implementing the full repertoire of instructions manipulating a single subset A of a finite ordered Universe U = [0 ... u-1]. Instructions include member, insert, delete, min, max, predecessor and successor, as well as composite ones like extract-min. The processing time per instruction is O(log log(u)). Hence it improves upon the traditional comparison-based tree structures for dense subsets A; if A is sparse, meaning that the size n = #A = O(log(u)), the improvement vanishes.
Examples exist where this improvement helps to speed up algorithmic solutions of real problems; such applications can be found, for example, in graph algorithms, computational geometry and forwarding of packets on the internet.
The structure was invented during a short postdoc residence at Cornell University in the fall of 1974. In the remainder of this paper I will use the original name Stratified Trees, which was used in my own papers on this data structure.
There are two strategies for understanding how this improvement can be obtained. Today a direct recursive approach is used where the universe is divided into a cluster of √u galaxies each of size √u; the set manipulation instructions decompose accordingly into an instruction at the cluster level and one at the galaxy level, but one of these two instructions is always of a special trivial type. The processing complexity thus satisfies a recurrence T(u) = T(√u) + O(1). Consequently T(u) = O(log log(u)).
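The recursive approach described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not code from the paper or from the original sources: the universe size is assumed to be a power of two (2**bits), the class name `VEB` and its attribute names are invented for this example, galaxies are kept in a Python dict and created lazily (a hashing shortcut the original structure does not use), and the address calculations are done with shifts and masks. Only member, insert and successor are shown.

```python
class VEB:
    """Sketch of a stratified tree over the universe {0, ..., 2**bits - 1}."""

    def __init__(self, bits):
        self.bits = bits
        self.min = None  # smallest element; kept here, not inside a galaxy
        self.max = None  # largest element
        if bits > 1:
            self.lo_bits = bits // 2             # bits addressing inside a galaxy
            self.hi_bits = bits - self.lo_bits   # bits selecting the galaxy
            self.summary = None                  # cluster level: non-empty galaxies
            self.galaxies = {}                   # galaxy index -> VEB(lo_bits)

    def _split(self, x):
        """Address calculation: (galaxy index, position inside the galaxy)."""
        return x >> self.lo_bits, x & ((1 << self.lo_bits) - 1)

    def member(self, x):
        if x == self.min or x == self.max:
            return True
        if self.bits == 1 or self.min is None:
            return False
        hi, lo = self._split(x)
        g = self.galaxies.get(hi)
        return g is not None and g.member(lo)

    def insert(self, x):
        if self.min is None:
            self.min = self.max = x   # the trivial case: empty structure, O(1)
            return
        if x == self.min:
            return                    # already present
        if x < self.min:
            x, self.min = self.min, x  # new minimum; push the old one down
        if x > self.max:
            self.max = x
        if self.bits == 1:
            return
        hi, lo = self._split(x)
        g = self.galaxies.get(hi)
        if g is None:
            # Empty galaxy: the cluster-level insert does the real work and
            # the galaxy-level insert below hits the trivial case.
            g = self.galaxies[hi] = VEB(self.lo_bits)
            if self.summary is None:
                self.summary = VEB(self.hi_bits)
            self.summary.insert(hi)
        g.insert(lo)

    def successor(self, x):
        """Smallest stored element strictly greater than x, or None."""
        if self.min is not None and x < self.min:
            return self.min
        if self.max is None or x >= self.max:
            return None
        if self.bits == 1:
            return self.max
        hi, lo = self._split(x)
        g = self.galaxies.get(hi)
        if g is not None and lo < g.max:
            return (hi << self.lo_bits) | g.successor(lo)
        nxt = self.summary.successor(hi)  # next non-empty galaxy
        return (nxt << self.lo_bits) | self.galaxies[nxt].min


# Usage on a universe of size 16.
t = VEB(4)
for key in (2, 3, 7, 14):
    t.insert(key)
```

In each operation at most one of the two recursive calls does more than constant work, and each such call halves the number of address bits. Writing m = log(u), the cost therefore satisfies S(m) = S(m/2) + O(1), which gives O(log m) = O(log log(u)).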
However, this recursive approach requires address calculations on the arguments which use multiplicative arithmetical instructions. These instructions are not allowed in the Random Access Machine (RAM) model, which was the standard model in the developing research area of design and analysis of algorithms in 1974. Therefore the early implementations of the stratified trees are based on a different approach, which is best described as a binary-search-on-levels strategy. In this approach the address calculations are not required, and the structure can be implemented using pointers. The downside of this approach is that it leads to rather complex algorithms, which are still hard to present correctly even today. Another bad consequence was the superlinear space consumption of the data structure, which was only eliminated three years later.
In this paper I want to describe the historical background against which the stratified trees were discovered and implemented. I do not give complete code fragments implementing the data structure and the operations; they can be found in the various textbooks and papers mentioned, including a Wikipedia page. Code fragments appearing in this paper are copied verbatim from the original sources; the same holds for the figures.