54 research outputs found

New Algorithms and Lower Bounds for Sequential-Access Data Compression

Author: Gagie Travis
Publication date: 01/01/2009

This thesis concerns sequential-access data compression, i.e., compression by algorithms that read the input one or more times from beginning to end. In one chapter we consider adaptive prefix coding, for which we must read the input character by character, outputting each character's self-delimiting codeword before reading the next one. We show how to encode and decode each character in constant worst-case time while producing an encoding whose length is worst-case optimal. In another chapter we consider one-pass compression with memory bounded in terms of the alphabet size and context length, and prove a nearly tight tradeoff between the amount of memory we can use and the quality of the compression we can achieve. In a third chapter we consider compression in the read/write streams model, which allows us a number of passes and an amount of memory that are both polylogarithmic in the size of the input. We first show how to achieve universal compression using only one pass over one stream. We then show that one stream is not sufficient for achieving good grammar-based compression. Finally, we show that two streams are necessary and sufficient for achieving entropy-only bounds. Comment: draft of PhD thesis
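
The adaptive prefix coding setting described in this abstract can be illustrated with a small sketch. This is not the thesis's algorithm: it swaps in a simple move-to-front table combined with Elias gamma codes, and it spends time proportional to the alphabet size per character rather than constant time. The function names `elias_gamma`, `encode`, and `decode` are hypothetical, chosen only for this illustration.

```python
def elias_gamma(n: int) -> str:
    """Self-delimiting code for n >= 1: (len - 1) zeros, then n in binary."""
    if n < 1:
        raise ValueError("Elias gamma codes are defined for n >= 1")
    binary = bin(n)[2:]
    return "0" * (len(binary) - 1) + binary


def encode(text: str, alphabet: str) -> str:
    """Emit one codeword per character before looking at the next one."""
    table = list(alphabet)          # adaptive state shared with the decoder
    out = []
    for ch in text:
        rank = table.index(ch)      # O(alphabet size), unlike the thesis
        out.append(elias_gamma(rank + 1))
        table.insert(0, table.pop(rank))  # move-to-front update
    return "".join(out)


def decode(bits: str, alphabet: str) -> str:
    """Mirror the encoder's table updates while reading codewords."""
    table = list(alphabet)
    out = []
    i = 0
    while i < len(bits):
        zeros = 0
        while bits[i] == "0":       # unary length prefix
            zeros += 1
            i += 1
        rank = int(bits[i:i + zeros + 1], 2) - 1
        i += zeros + 1
        ch = table.pop(rank)
        table.insert(0, ch)
        out.append(ch)
    return "".join(out)
```

Because every codeword is self-delimiting and the decoder replays the same table updates, the decoder can recover each character as soon as its codeword has arrived.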

arXiv.org e-Print Archive

Publications at Bielefeld University

Radix Sorting With No Extra Space

Author: Franceschini Gianni
Muthukrishnan S.
Patrascu Mihai
Publication date: 01/01/2007

It is well known that n integers in the range [1,n^c] can be sorted in O(n) time in the RAM model using radix sorting. More generally, integers in any range [1,U] can be sorted in O(n sqrt{loglog n}) time. However, these algorithms use O(n) words of extra memory. Is this necessary? We present a simple, stable, integer sorting algorithm for words of size O(log n), which works in O(n) time and uses only O(1) words of extra memory on a RAM model. This is the integer sorting case most useful in practice. We extend this result with the same bounds to the case when the keys are read-only, which is of theoretical interest. Another interesting question is the case of arbitrary c. Here we present a black-box transformation from any RAM sorting algorithm to a sorting algorithm which uses only O(1) extra space and has the same running time. This settles the complexity of in-place sorting in terms of the complexity of sorting. Comment: Full version of paper accepted to ESA 2007. (17 pages)
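
For context, the classic radix sort that this abstract takes as its starting point can be sketched as follows. This is the textbook variant that uses O(n) words of extra memory, not the in-place algorithm the paper presents. The name `radix_sort` and the choice of base `max(2, n)` are assumptions made for this sketch; with that base, keys bounded by n^c need about c passes.

```python
def radix_sort(keys, base=0):
    """LSD radix sort of non-negative integers using O(n) extra words."""
    n = len(keys)
    out = list(keys)
    if n < 2:
        return out
    if base < 2:
        base = max(2, n)            # keys <= n^c then need about c passes
    largest = max(out)
    place = 1
    while place <= largest:
        # Counting sort on the current digit; stability keeps earlier
        # passes (lower digits) in order.
        counts = [0] * base
        for key in out:
            counts[(key // place) % base] += 1
        total = 0
        for digit in range(base):   # turn counts into start positions
            counts[digit], total = total, total + counts[digit]
        buffer = [0] * n            # the O(n) extra memory in question
        for key in out:
            digit = (key // place) % base
            buffer[counts[digit]] = key
            counts[digit] += 1
        out = buffer
        place *= base
    return out
```

The `buffer` and `counts` arrays are exactly the extra space that the abstract asks whether one can avoid.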

arXiv.org e-Print Archive

Archivio della ricerca- Università di Roma La Sapienza

Selection from read-only memory with limited workspace

Author: A. Golynski
B. Chazelle
D.E. Knuth
G. Jacobson
G. Navarro
G.N. Frederickson
J. Pagter
J.I. Munro
M. Blum
P. Beame
R. Grossi
R. Raman
T. Asano
T.H. Cormen
T.M. Chan
V. Raman
Publication date: 01/01/2013

Given an unordered array of $N$ elements drawn from a totally ordered set and an integer $k$ in the range from $1$ to $N$, in the classic selection problem the task is to find the $k$-th smallest element in the array. We study the complexity of this problem in the space-restricted random-access model: The input array is stored on read-only memory, and the algorithm has access to a limited amount of workspace. We prove that the linear-time prune-and-search algorithm, presented in most textbooks on algorithms, can be modified to use $\Theta(N)$ bits instead of $\Theta(N)$ words of extra space. Prior to our work, the best known algorithm by Frederickson could perform the task with $\Theta(N)$ bits of extra space in $O(N \lg^{*} N)$ time. Our result separates the space-restricted random-access model and the multi-pass streaming model, since we can surpass the $\Omega(N \lg^{*} N)$ lower bound known for the latter model. We also generalize our algorithm for the case when the size of the workspace is $\Theta(S)$ bits, where $\lg^3 N \leq S \leq N$. The running time of our generalized algorithm is $O(N \lg^{*}(N/S) + N (\lg N) / \lg S)$, slightly improving over the $O(N \lg^{*}(N (\lg N)/S) + N (\lg N) / \lg S)$ bound of Frederickson's algorithm. To obtain the improvements mentioned above, we developed a new data structure, called the wavelet stack, that we use for repeated pruning. We expect the wavelet stack to be a useful tool in other applications as well. Comment: 16 pages, 1 figure, Preliminary version appeared in COCOON-201
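
The textbook prune-and-search algorithm that this abstract refers to can be sketched as below. This is the standard median-of-medians selection, which copies the surviving elements and therefore uses a linear number of words of extra space. It does not use the wavelet stack or any other technique from the paper. The name `select` is hypothetical.

```python
def select(items, k):
    """Return the k-th smallest element (1-based) by prune-and-search."""
    items = list(items)             # working copy: Theta(N) words
    if not 1 <= k <= len(items):
        raise IndexError("k must be in the range 1..N")
    while True:
        if len(items) <= 5:
            return sorted(items)[k - 1]
        # Median of each group of (at most) five elements.
        medians = []
        for start in range(0, len(items), 5):
            group = sorted(items[start:start + 5])
            medians.append(group[len(group) // 2])
        pivot = select(medians, (len(medians) + 1) // 2)
        smaller = [x for x in items if x < pivot]
        equal = sum(1 for x in items if x == pivot)
        if k <= len(smaller):
            items = smaller         # prune everything >= pivot
        elif k <= len(smaller) + equal:
            return pivot
        else:
            k -= len(smaller) + equal
            items = [x for x in items if x > pivot]  # prune <= pivot
```

Each round discards a constant fraction of the remaining elements, which is what gives the linear running time; the lists `smaller` and `medians` are where the extra words go.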

arXiv.org e-Print Archive

Crossref

Copenhagen University Research Information System

Memory-Adjustable Navigation Piles with Applications to Sorting and Convex Hulls

Author: Darwish Omar
Elmasry Amr
Katajainen Jyrki
Publication date: 24/10/2015

We consider space-bounded computations on a random-access machine (RAM) where the input is given on a read-only random-access medium, the output is to be produced to a write-only sequential-access medium, and the available workspace allows random reads and writes but is of limited capacity. The length of the input is $N$ elements, the length of the output is limited by the computation, and the capacity of the workspace is $O(S)$ bits for some predetermined parameter $S$. We present a state-of-the-art priority queue, called an adjustable navigation pile, for this restricted RAM model. Under some reasonable assumptions, our priority queue supports $\mathit{minimum}$ and $\mathit{insert}$ in $O(1)$ worst-case time and $\mathit{extract}$ in $O(N/S + \lg S)$ worst-case time for any $S \geq \lg N$. We show how to use this data structure to sort $N$ elements and to compute the convex hull of $N$ points in the two-dimensional Euclidean space in $O(N^2/S + N \lg S)$ worst-case time for any $S \geq \lg N$. Following a known lower bound for the space-time product of any branching program for finding unique elements, both our sorting and convex-hull algorithms are optimal. The adjustable navigation pile has turned out to be useful when designing other space-efficient algorithms, and we expect that it will find its way to yet other applications. Comment: 21 pages
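
The computational model in this abstract (read-only input, write-only sequential output, small workspace) can be illustrated with a naive multi-pass sort. This is not the adjustable navigation pile and does not reach the bounds stated above: it keeps at most `s` elements in its workspace, makes about N/s passes, and spends O(N log s) time per pass. The name `bounded_workspace_sort` and the parameter `s` (counted in elements, not bits) are assumptions of this sketch.

```python
import heapq


def bounded_workspace_sort(data, s):
    """Yield the elements of read-only `data` in sorted order.

    Only about `s` (value, index) pairs are held at any time. The
    generator plays the role of the write-only sequential output.
    """
    if s < 1:
        raise ValueError("workspace must hold at least one element")
    last = None                     # last (value, index) pair written
    while True:
        # One pass over the input: keep the s smallest pairs that come
        # after `last`. Pairing with the index breaks ties stably.
        candidates = (
            (x, i) for i, x in enumerate(data)
            if last is None or (x, i) > last
        )
        batch = heapq.nsmallest(s, candidates)
        if not batch:
            return
        for x, _ in batch:
            yield x
        last = batch[-1]
```

Shrinking `s` trades workspace for more passes over the input, which is the kind of space-time tradeoff the abstract's bounds quantify far more tightly.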

arXiv.org e-Print Archive

MPG.PuRe

A Space-Optimal Grammar Compression

Author: I Tomohiro
Sakamoto Hiroshi
Takabatake Yoshimasa
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 25th Annual European Symposium on Algorithms (ESA 2017)
Publication date: 01/01/2017

A grammar compression is a context-free grammar (CFG) deriving a single string deterministically. For an input string of length N over an alphabet of size sigma, the smallest CFG is O(log N)-approximable in the offline setting and O(log N log^* N)-approximable in the online setting. In addition, an information-theoretic lower bound for representing a CFG in Chomsky normal form of n variables is log (n!/n^sigma) + n + o(n) bits. Although there is an online grammar compression algorithm that directly computes the succinct encoding of its output CFG with O(log N log^* N) approximation guarantee, the problem of optimizing its working space has remained open. We propose a fully-online algorithm that requires the fewest bits of working space, asymptotically equal to the lower bound, in O(N log log n) compression time. In addition, we propose several techniques to boost grammar compression and show their efficiency by computational experiments.
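
To make the first sentence of this abstract concrete, here is a minimal sketch of a grammar that derives exactly one string, together with a routine that expands it. It only shows the object being compressed to. It is not the paper's online algorithm or its succinct encoding. The rule names `X1`, `X2`, `X3` and the function `expand` are hypothetical.

```python
def expand(rules, start):
    """Derive the single string generated from `start`.

    `rules` maps each variable to a pair of symbols (Chomsky normal
    form style); any symbol that is not a key is treated as a terminal.
    """
    out = []
    stack = [start]
    while stack:
        symbol = stack.pop()
        if symbol in rules:
            left, right = rules[symbol]
            stack.append(right)     # push right first so left is
            stack.append(left)      # expanded first
        else:
            out.append(symbol)
    return "".join(out)


# Three rules suffice for a string of length six.
example_rules = {
    "X1": ("a", "b"),
    "X2": ("X1", "X1"),
    "X3": ("X2", "X1"),
}
```

Here `expand(example_rules, "X3")` derives `ababab`. Each variable has exactly one rule, so the derivation is deterministic and the grammar stands for a single string.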

Dagstuhl Research Online Publication Server

Quicksort Is Optimal For Many Equal Keys

Author: Wild Sebastian
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 17/08/2016

I prove that the average number of comparisons for median-of-$k$ Quicksort (with fat-pivot a.k.a. three-way partitioning) is asymptotically only a constant $\alpha_k$ times worse than the lower bound for sorting random multisets with $\Omega(n^\varepsilon)$ duplicates of each value (for any $\varepsilon>0$). The constant is $\alpha_k = \ln(2) / \bigl(H_{k+1}-H_{(k+1)/2} \bigr)$, which converges to 1 as $k\to\infty$, so Quicksort is asymptotically optimal for inputs with many duplicates. This resolves a conjecture by Sedgewick and Bentley (1999, 2002) and constitutes the first progress on the analysis of Quicksort with equal elements since Sedgewick's 1977 article.
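
The constant $\alpha_k$ and the fat-pivot partitioning named in this abstract can be sketched as follows. The `alpha` function evaluates the stated formula directly. The `quicksort3` function is a generic three-way (Dijkstra-style) partitioning Quicksort with a median-of-$k$ pivot sample. The sampling details and all names are assumptions of this sketch and may differ from the variant analyzed in the paper.

```python
import math
import random
from fractions import Fraction


def harmonic(m):
    """Exact harmonic number H_m."""
    return sum(Fraction(1, i) for i in range(1, m + 1))


def alpha(k):
    """alpha_k = ln(2) / (H_{k+1} - H_{(k+1)/2}) for odd k >= 1."""
    if k < 1 or k % 2 == 0:
        raise ValueError("k must be a positive odd integer")
    return math.log(2) / float(harmonic(k + 1) - harmonic((k + 1) // 2))


def quicksort3(values, k=3, rng=random):
    """Sort a copy of `values` using median-of-k, three-way partitioning."""
    a = list(values)

    def sort(lo, hi):               # sorts the slice a[lo:hi]
        while hi - lo > 1:
            # Pivot: median of up to k distinct sampled positions.
            positions = rng.sample(range(lo, hi), min(k, hi - lo))
            sample = sorted(a[p] for p in positions)
            pivot = sample[len(sample) // 2]
            lt, i, gt = lo, lo, hi
            while i < gt:           # a[lo:lt] < pivot, a[gt:hi] > pivot
                if a[i] < pivot:
                    a[lt], a[i] = a[i], a[lt]
                    lt += 1
                    i += 1
                elif a[i] > pivot:
                    gt -= 1
                    a[i], a[gt] = a[gt], a[i]
                else:
                    i += 1          # equal keys stay in the middle
            # Recurse on the smaller side, loop on the larger one.
            if lt - lo < hi - gt:
                sort(lo, lt)
                lo = gt
            else:
                sort(gt, hi)
                hi = lt

    sort(0, len(a))
    return a
```

For $k=1$ the formula gives $\alpha_1 = 2\ln 2 \approx 1.386$, and for $k=3$ it gives $\alpha_3 = \tfrac{12}{7}\ln 2 \approx 1.188$. Elements equal to the pivot are never touched again after partitioning, which is why many duplicates make this variant faster.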

arXiv.org e-Print Archive

University of Liverpool Repository

Crossref