Optimal Prefix Codes with Fewer Distinct Codeword Lengths are Faster to Construct
A new method for constructing minimum-redundancy binary prefix codes is
described. Our method does not explicitly build a Huffman tree; instead it uses
a property of optimal prefix codes to compute the codeword lengths
corresponding to the input weights. Let $n$ be the number of weights and $k$ be
the number of distinct codeword lengths as produced by the algorithm for the
optimum codes. The running time of our algorithm is $O(kn)$. Following
our previous work in \cite{be}, no algorithm can possibly construct optimal
prefix codes in $o(kn)$ time. When the given weights are presorted our
algorithm performs $O(9^k \log^{2k-1} n)$ comparisons. Comment: 23 pages, a preliminary version appeared in STACS 2006
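For contrast with the bound above, here is a minimal textbook baseline that computes optimal codeword lengths with a heap in $O(n \log n)$ time without storing an explicit tree; this is standard Huffman merging, not the paper's distribution-sensitive algorithm, and all names are illustrative.

```python
import heapq

def huffman_code_lengths(weights):
    """Textbook O(n log n) baseline: compute optimal (Huffman) codeword
    lengths from positive weights without building an explicit tree;
    each heap entry carries the indices of the leaves below it."""
    if len(weights) == 1:
        return [1]
    heap = [(w, [i]) for i, w in enumerate(weights)]
    heapq.heapify(heap)
    depth = [0] * len(weights)
    while len(heap) > 1:
        w1, leaves1 = heapq.heappop(heap)
        w2, leaves2 = heapq.heappop(heap)
        for leaf in leaves1 + leaves2:
            depth[leaf] += 1          # every merge pushes these leaves one level down
        heapq.heappush(heap, (w1 + w2, leaves1 + leaves2))
    return depth

print(huffman_code_lengths([1, 1, 2, 3, 5]))  # -> [4, 4, 3, 2, 1]
```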
Lower Bounds on the Redundancy of Huffman Codes with Known and Unknown Probabilities
In this paper we provide a method to obtain tight lower bounds on the minimum
redundancy achievable by a Huffman code when the probability distribution
underlying an alphabet is only partially known. In particular, we address the
case where the occurrence probabilities are unknown for some of the symbols in
an alphabet. Bounds can be obtained for alphabets of a given size, for
alphabets of up to a given size, and for alphabets of arbitrary size. The
method operates on a computer algebra system, yielding closed-form expressions
for all results. Finally, we show the potential of the proposed method to shed some
light on the structure of the minimum redundancy achievable by the Huffman
code.
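As a reminder of the quantity being bounded, the redundancy of a Huffman code is its expected codeword length minus the source entropy. A small sketch for a fully known distribution follows; it is illustrative only and is not the paper's computer-algebra method.

```python
import heapq
from math import log2

def huffman_redundancy(probs):
    """Redundancy of a Huffman code: expected codeword length minus the
    source entropy H(p). The paper instead derives closed-form lower
    bounds when some of the probabilities are unknown."""
    heap = list(probs)
    heapq.heapify(heap)
    expected_length = 0.0
    while len(heap) > 1:
        a, b = heapq.heappop(heap), heapq.heappop(heap)
        expected_length += a + b      # each merge adds its weight once per tree level
        heapq.heappush(heap, a + b)
    entropy = -sum(p * log2(p) for p in probs)
    return expected_length - entropy

print(huffman_redundancy([0.5, 0.25, 0.125, 0.125]))  # dyadic distribution -> 0.0
```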
Optimal prefix codes for pairs of geometrically-distributed random variables
Optimal prefix codes are studied for pairs of independent, integer-valued
symbols emitted by a source with a geometric probability distribution of
parameter $q$, $0<q<1$. By encoding pairs of symbols, it is possible to
reduce the redundancy penalty of symbol-by-symbol encoding, while preserving
the simplicity of the encoding and decoding procedures typical of Golomb codes
and their variants. It is shown that optimal codes for these so-called
two-dimensional geometric distributions are \emph{singular}, in the sense that
a prefix code that is optimal for one value of the parameter $q$ cannot be
optimal for any other value of $q$. This is in sharp contrast to the
one-dimensional case, where codes are optimal for positive-length intervals of
the parameter $q$. Thus, in the two-dimensional case, it is infeasible to give
a compact characterization of optimal codes for all values of the parameter
$q$, as was done in the one-dimensional case. Instead, optimal codes are
characterized for a discrete sequence of values of $q$ that provides good
coverage of the unit interval. Specifically, optimal prefix codes are described
for two families of parameter values, one covering the lower and one the upper
part of the unit interval. The described codes produce the expected
reduction in redundancy with respect to the one-dimensional case, while
maintaining low-complexity coding operations. Comment: To appear in IEEE Transactions on Information Theory
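The Golomb codes mentioned above encode a nonnegative integer as a unary quotient followed by a truncated-binary remainder, which is what keeps the coding operations simple. A minimal sketch follows; the parameter m below is an illustrative choice, not one taken from the paper.

```python
def golomb_encode(n, m):
    """Golomb code of order m: unary quotient, then truncated-binary
    remainder (optimal symbol by symbol for geometric sources)."""
    q, r = divmod(n, m)
    out = "1" * q + "0"               # unary part: q ones terminated by a zero
    b = m.bit_length()                # remainder takes b-1 or b bits
    cutoff = (1 << b) - m             # first `cutoff` remainders use b-1 bits
    if r < cutoff:
        out += format(r, "b").zfill(b - 1) if b > 1 else ""
    else:
        out += format(r + cutoff, "b").zfill(b)
    return out

# Example, m = 3: remainders map to "0", "10", "11"
print([golomb_encode(n, 3) for n in range(5)])  # ['00', '010', '011', '100', '1010']
```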
An implementation of Deflate in Coq
The widely-used compression format "Deflate" is defined in RFC 1951 and is
based on prefix-free codings and backreferences. There are unclear points about
the way these codings are specified, and several sources of confusion in the
standard. We tried to fix this problem by giving a rigorous mathematical
specification, which we formalized in Coq. We produced a verified
implementation in Coq which achieves competitive performance on inputs of
several megabytes. In this paper we present the main parts of our
implementation: a fully verified implementation of canonical prefix-free
codings, which can be used in other compression formats as well, and an elegant
formalism for specifying sophisticated formats, which we used to implement both
a compression and decompression algorithm in Coq which we formally prove
inverse to each other -- the first time this has been achieved to our
knowledge. Compatibility with other Deflate implementations can be shown
empirically. We furthermore discuss some of the difficulties, specifically
regarding memory and runtime requirements, and our approaches to overcoming them.
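Canonical prefix-free codings of the kind verified here are determined entirely by the list of codeword lengths. Below is a minimal sketch of the standard RFC 1951 assignment rule (section 3.2.2); it is the textbook procedure, not the authors' Coq implementation.

```python
def canonical_codes(lengths):
    """Assign canonical codewords from bit lengths as in RFC 1951:
    codes of equal length are consecutive binary integers, ordered by
    symbol; a length of 0 means the symbol is unused."""
    max_len = max(lengths)
    bl_count = [0] * (max_len + 1)
    for l in lengths:
        if l:
            bl_count[l] += 1
    next_code = [0] * (max_len + 1)
    code = 0
    for bits in range(1, max_len + 1):
        code = (code + bl_count[bits - 1]) << 1   # smallest code of this length
        next_code[bits] = code
    codes = {}
    for sym, l in enumerate(lengths):
        if l:
            codes[sym] = format(next_code[l], "b").zfill(l)
            next_code[l] += 1
    return codes

# RFC 1951's worked example: lengths (3,3,3,3,3,2,4,4) for symbols 0..7
print(canonical_codes([3, 3, 3, 3, 3, 2, 4, 4]))
```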
Polar codes with a stepped boundary
We consider explicit polar constructions of blocklength $n \to \infty$
for the two extreme cases of code rates $R \to 1$ and $R \to 0$.
For code rates $R \to 1$, we design codes with complexity order of $n \log n$ in code construction, encoding, and decoding. These codes achieve
vanishing output bit error rates on the binary symmetric channels with any
transition error probability $p \to 0$ and perform this task with a
substantially smaller redundancy than do other known high-rate codes,
such as BCH codes or Reed-Muller (RM) codes. We then extend our design to the
low-rate codes that achieve vanishing output error rates with the same
complexity order of $n \log n$ and an asymptotically optimal code rate
for the case of $p \to 1/2$. Comment: This article has been submitted to ISIT 2018
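The encoder behind any such polar construction is Arikan's recursive transform $x = u F^{\otimes m}$ over GF(2). A minimal sketch follows; the selection of frozen positions, including the paper's stepped boundary, is omitted.

```python
def polar_transform(u):
    """Arikan's polar transform x = u F^{(tensor)m} over GF(2) for a bit
    vector of length 2^m, computed recursively: the first half becomes
    the XOR of the two halves, the second half passes through."""
    if len(u) == 1:
        return u
    half = len(u) // 2
    xored = [u[i] ^ u[i + half] for i in range(half)]
    return polar_transform(xored) + polar_transform(u[half:])

# In a polar code, frozen positions of u are pinned to 0 and the
# information bits fill the rest; the codeword is the transform:
print(polar_transform([1, 0, 1, 1]))  # -> [1, 1, 0, 1]
```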
Optimal Prefix Free Codes with Partial Sorting
We describe an algorithm computing an optimal prefix free code for n unsorted positive weights in less time than required to sort them on many large classes of instances, identified by a new measure of difficulty for this problem, the alternation alpha. This asymptotic complexity is within a constant factor of the optimal in the algebraic decision tree computational model, in the worst case over all instances of fixed size n and alternation alpha. Such results refine the state of the art complexity in the worst case over instances of size n in the same computational model, a landmark in compression and coding since 1952, by the mere combination of van Leeuwen's algorithm to compute optimal prefix free codes from sorted weights (known since 1976) with Deferred Data Structures to partially sort multisets (known since 1988).
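Van Leeuwen's 1976 algorithm exploits the fact that Huffman merging over presorted weights needs only two FIFO queues rather than a priority queue, giving linear time. A minimal sketch that computes just the total code cost follows; it illustrates the 1976 ingredient only, not the paper's combination with deferred data structures.

```python
from collections import deque

def huffman_cost_sorted(weights):
    """van Leeuwen-style O(n) Huffman merging over presorted weights:
    leaves wait in one FIFO queue, merged nodes in a second; the smaller
    front of the two queues is always the global minimum."""
    leaves = deque(weights)           # must be in nondecreasing order
    merged = deque()
    def pop_min():
        if not merged or (leaves and leaves[0] <= merged[0]):
            return leaves.popleft()
        return merged.popleft()
    cost = 0
    while len(leaves) + len(merged) > 1:
        s = pop_min() + pop_min()
        merged.append(s)              # merged weights emerge in sorted order
        cost += s
    return cost

print(huffman_cost_sorted([1, 1, 2, 3, 5]))  # 25, i.e. sum of weight*length
```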
Reliable Memory Storage by Natural Redundancy
Non-volatile memories are becoming the dominant type of storage devices in modern computers because of their fast speed, physical robustness and high data density. However, there still exist many challenges, such as data reliability issues due to noise. An important example is the memristor, which uses programmable resistance to store data. Memristor memories use the crossbar architecture and suffer from the sneak-path problem: when a memristor cell of high resistance is read, it can be mistakenly read as a low-resistance cell due to low-resistance sneak paths in the crossbar that are parallel to the cell. In this work, we study new ways to correct errors using the inherent redundancy in stored data (called Natural Redundancy), and combine them with conventional error-correcting codes. In particular, we define a Huffman encoding for the English language based on a repository of books. In addition, we study data stored using convolutional codes and use natural redundancy to verify whether decoded codewords are valid. We present statistics on the Viterbi algorithm and its ability to decode convolutional codewords, then discuss Yen's algorithm, an augmentation of the Viterbi algorithm. Finally, we present an efficient algorithm to search for a list of the most likely codewords, and choose a codeword that meets the criteria of both natural redundancy and the ECC as the decoding solution. We find that this algorithm is no more powerful than Yen's algorithm in terms of decoding noisy convolutional codewords, but it does present some interesting ideas for further exploration across multiple fields of study.
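As a baseline for the decoding experiments described above, here is a minimal hard-decision Viterbi decoder for the standard rate-1/2, constraint-length-3 convolutional code with generators (7, 5) octal. The specific code and all names are our illustrative choices, not necessarily those used in the paper.

```python
def viterbi_decode(bits, n_msg):
    """Hard-decision Viterbi for the rate-1/2, K=3 code with generators
    (7, 5) octal; state = the last two input bits, path metric = Hamming
    distance between received and hypothesized output pairs."""
    INF = float("inf")
    metric = {0: 0, 1: INF, 2: INF, 3: INF}    # encoder starts in the zero state
    paths = {s: [] for s in metric}
    for t in range(n_msg):
        r = (bits[2 * t], bits[2 * t + 1])     # received output pair at time t
        new_metric = {s: INF for s in metric}
        new_paths = {}
        for s in metric:
            if metric[s] == INF:
                continue
            for u in (0, 1):                   # hypothesized input bit
                u1, u2 = (s >> 1) & 1, s & 1   # previous two input bits
                out = (u ^ u1 ^ u2, u ^ u2)    # taps 111 (7) and 101 (5)
                ns = (u << 1) | u1             # new bit shifts in, oldest drops
                d = metric[s] + (out[0] != r[0]) + (out[1] != r[1])
                if d < new_metric[ns]:
                    new_metric[ns], new_paths[ns] = d, paths[s] + [u]
        metric, paths = new_metric, new_paths
    best = min(metric, key=metric.get)         # survivor with the best metric
    return paths[best]

# The encoding of message 1,0,1,1 is 11 10 00 01; decode it back:
print(viterbi_decode([1, 1, 1, 0, 0, 0, 0, 1], 4))  # -> [1, 0, 1, 1]
```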