Search CORE

9,255 research outputs found

New Algorithms and Lower Bounds for Sequential-Access Data Compression

Author: Gagie Travis
Publication venue
Publication date: 01/01/2009
Field of study

This thesis concerns sequential-access data compression, i.e., by algorithms that read the input one or more times from beginning to end. In one chapter we consider adaptive prefix coding, for which we must read the input character by character, outputting each character's self-delimiting codeword before reading the next one. We show how to encode and decode each character in constant worst-case time while producing an encoding whose length is worst-case optimal. In another chapter we consider one-pass compression with memory bounded in terms of the alphabet size and context length, and prove a nearly tight tradeoff between the amount of memory we can use and the quality of the compression we can achieve. In a third chapter we consider compression in the read/write streams model, which allows us passes and memory both polylogarithmic in the size of the input. We first show how to achieve universal compression using only one pass over one stream. We then show that one stream is not sufficient for achieving good grammar-based compression. Finally, we show that two streams are necessary and sufficient for achieving entropy-only bounds.Comment: draft of PhD thesi

arXiv.org e-Print Archive

Publications at Bielefeld University

Recommended from our members

Parallel data compression

Author: Hirschberg Daniel S.
Stauffer Lynn M.
Publication venue: eScholarship, University of California
Publication date: 01/05/1991
Field of study

Data compression schemes remove data redundancy in communicated and stored data and increase the effective capacities of communication and storage devices. Parallel algorithms and implementations for textual data compression are surveyed. Related concepts from parallel computation and information theory are briefly discussed. Static and dynamic methods for codeword construction and transmission on various models of parallel computation are described. Included are parallel methods which boost system speed by coding data concurrently, and approaches which employ multiple compression techniques to improve compression ratios. Theoretical and empirical comparisons are reported and areas for future research are suggested

eScholarship - University of California

Streamability of nested word transductions

Author: Filiot Emmanuel
Gauwin Olivier
Reynier Pierre-Alain
Servais Frédéric
Publication venue
Publication date: 01/01/2019
Field of study

We consider the problem of evaluating in streaming (i.e., in a single left-to-right pass) a nested word transduction with a limited amount of memory. A transduction T is said to be height bounded memory (HBM) if it can be evaluated with a memory that depends only on the size of T and on the height of the input word. We show that it is decidable in coNPTime for a nested word transduction defined by a visibly pushdown transducer (VPT), if it is HBM. In this case, the required amount of memory may depend exponentially on the height of the word. We exhibit a sufficient, decidable condition for a VPT to be evaluated with a memory that depends quadratically on the height of the word. This condition defines a class of transductions that strictly contains all determinizable VPTs

arXiv.org e-Print Archive

HAL AMU

Episciences.org

DI-fusion

Universal lossless source coding with the Burrows Wheeler transform

Author: Effros Michelle
Kulkarni Sanjeev R.
Verdú Sergio
Visweswariah Karthik
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2002
Field of study

The Burrows Wheeler transform (1994) is a reversible sequence transformation used in a variety of practical lossless source-coding algorithms. In each, the BWT is followed by a lossless source code that attempts to exploit the natural ordering of the BWT coefficients. BWT-based compression schemes are widely touted as low-complexity algorithms giving lossless coding rates better than those of the Ziv-Lempel codes (commonly known as LZ'77 and LZ'78) and almost as good as those achieved by prediction by partial matching (PPM) algorithms. To date, the coding performance claims have been made primarily on the basis of experimental results. This work gives a theoretical evaluation of BWT-based coding. The main results of this theoretical evaluation include: (1) statistical characterizations of the BWT output on both finite strings and sequences of length n → ∞, (2) a variety of very simple new techniques for BWT-based lossless source coding, and (3) proofs of the universality and bounds on the rates of convergence of both new and existing BWT-based codes for finite-memory and stationary ergodic sources. The end result is a theoretical justification and validation of the experimentally derived conclusions: BWT-based lossless source codes achieve universal lossless coding performance that converges to the optimal coding performance more quickly than the rate of convergence observed in Ziv-Lempel style codes and, for some BWT-based codes, within a constant factor of the optimal rate of convergence for finite-memory source

CiteSeerX

Caltech Authors

Decompositional state assignment with reuse of standard designs : using counters as sub-machines and using the method of maximal adjacencies to select the state chains and the state codes

Author: Jozwiak L.
Kwaaitaal-Spassova T.G.
Publication venue: Technische Universiteit Eindhoven
Publication date: 01/01/1990
Field of study

Repository TU/e

Pure OAI Repository

Computing with Coloured Tangles

Author: Carmi Avishy Y.
Moskovich Daniel
Publication venue: 'MDPI AG'
Publication date: 01/07/2015
Field of study

We suggest a diagrammatic model of computation based on an axiom of distributivity. A diagram of a decorated coloured tangle, similar to those that appear in low dimensional topology, plays the role of a circuit diagram. Equivalent diagrams represent bisimilar computations. We prove that our model of computation is Turing complete, and that with bounded resources it can moreover decide any language in complexity class IP, sometimes with better performance parameters than corresponding classical protocols.Comment: 36 pages,; Introduction entirely rewritten, Section 4.3 adde

arXiv.org e-Print Archive

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

On the operating unit size of load/store architectures

Author: Bergstra
Bergstra
C. A. MIDDELBURG
Hennessy
Hoare
Hopcroft
J. A. BERGSTRA
Milner
Thornton
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 01/01/2007
Field of study

We introduce a strict version of the concept of a load/store instruction set architecture in the setting of Maurer machines. We take the view that transformations on the states of a Maurer machine are achieved by applying threads as considered in thread algebra to the Maurer machine. We study how the transformations on the states of the main memory of a strict load/store instruction set architecture that can be achieved by applying threads depend on the operating unit size, the cardinality of the instruction set, and the maximal number of states of the threads.Comment: 23 pages; minor errors corrected, explanations added, references replace

arXiv.org e-Print Archive

CiteSeerX

Crossref

UvA-DARE

International Migration, Integration and Social Cohesion online publications