
    Minimum Redundancy Coding for Uncertain Sources

    Consider the set of source distributions within a fixed maximum relative entropy with respect to a given nominal distribution. Lossless source coding over this relative entropy ball can be approached in more than one way. A previously considered problem is finding a minimax average-length source code. The minimizing players are the codeword lengths --- real numbers for arithmetic codes, integers for prefix codes --- while the maximizing players are the uncertain source distributions. Another traditional minimizing objective is the first one considered here, maximum (average) redundancy. This problem reduces to an extension of an exponential Huffman objective treated in the literature but heretofore without direct practical application. In addition, this paper examines the related problem of maximal minimax pointwise redundancy and the problem considered by Gawrychowski and Gagie, which, for a sufficiently small relative entropy ball, is equivalent to minimax redundancy. One can consider both Shannon-like coding based on optimal real-number ("ideal") codeword lengths and Huffman-like optimal prefix coding.
    Comment: 5 pages
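
The reduction to an exponential Huffman objective can be made concrete. Below is a minimal Python sketch (names and tie-breaking are my own, not from the paper) of the exponential Huffman procedure: instead of adding the two smallest weights as in ordinary Huffman coding, the merged weight is a*(w1 + w2), which minimizes log_a of the sum of p_i * a^{l_i}; with a = 1 the procedure degenerates to standard Huffman coding.

```python
import heapq

def exponential_huffman_lengths(probs, a=2.0):
    """Huffman-like merging for the exponential objective
    log_a(sum_i p_i * a^{l_i}): repeatedly merge the two smallest
    weights w1, w2 into a * (w1 + w2). Returns codeword lengths."""
    lengths = [0] * len(probs)
    # heap entries: (weight, list of source symbols under this subtree)
    heap = [(p, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, g1 = heapq.heappop(heap)
        w2, g2 = heapq.heappop(heap)
        for s in g1 + g2:
            lengths[s] += 1          # every merge deepens both subtrees
        heapq.heappush(heap, (a * (w1 + w2), g1 + g2))
    return lengths
```

For a dyadic source and a = 1 this reproduces the ordinary Huffman lengths, and for a uniform source any a > 1 yields a fixed-length code.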

    Optimal Prefix Codes for Infinite Alphabets with Nonlinear Costs

    Let P = {p(i)} be a measure of strictly positive probabilities on the set of nonnegative integers. Although the countable number of inputs prevents usage of the Huffman algorithm, there are nontrivial P for which known methods find a source code that is optimal in the sense of minimizing expected codeword length. For some applications, however, a source code should instead minimize one of a family of nonlinear objective functions, β-exponential means, those of the form log_a Σ_i p(i) a^{n(i)}, where n(i) is the length of the ith codeword and a is a positive constant. Applications of such minimizations include a novel problem of maximizing the chance of message receipt in single-shot communications (a < 1) and a previously known problem of minimizing the chance of buffer overflow in a queueing system (a > 1). This paper introduces methods for finding codes optimal for such exponential means. One method applies to geometric distributions, while another applies to distributions with lighter tails. The latter algorithm is applied to Poisson distributions and both are extended to alphabetic codes, as well as to minimizing maximum pointwise redundancy. The aforementioned application of minimizing the chance of buffer overflow is also considered.
    Comment: 14 pages, 6 figures, accepted to IEEE Trans. Inform. Theory
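
The β-exponential mean objective is straightforward to evaluate for a candidate length vector; a small sketch (function name is my own) that also shows it tends to the ordinary expected length as a approaches 1:

```python
import math

def beta_exponential_mean(probs, lengths, a):
    """The objective from the abstract: log_a sum_i p(i) * a^{n(i)},
    where n(i) is the length of the i-th codeword."""
    return math.log(sum(p * a ** n for p, n in zip(probs, lengths)), a)
```

For a dyadic source with lengths (1, 2, 3, 3) and a = 2, the value is exactly 2; with a close to 1, the value approaches the expected length 1.75.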

    Optimal Merging Algorithms for Lossless Codes with Generalized Criteria

    This paper presents lossless prefix codes optimized with respect to a pay-off criterion consisting of a convex combination of maximum codeword length and average codeword length. The optimal codeword lengths obtained are based on a new coding algorithm which transforms the initial source probability vector into a new probability vector according to a merging rule. The coding algorithm is equivalent to a partition of the source alphabet into disjoint sets on which a new transformed probability vector is defined as a function of the initial source probability vector and a scalar parameter. The pay-off criterion considered encompasses a trade-off between maximum and average codeword length; it is related to a pay-off criterion consisting of a convex combination of average codeword length and average of an exponential function of the codeword length, and to an average codeword length pay-off criterion subject to a limited length constraint. A special case of the first related pay-off is connected to coding problems involving source probability uncertainty and codeword overflow probability, while the second related pay-off complements limited-length Huffman coding algorithms.
    Comment: 40 pages; arXiv admin note: text overlap with arXiv:1102.2207, arXiv:1202.013
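
The pay-off criterion can be evaluated directly for any candidate length vector. The sketch below (the weighting parameter t and the exact convex-combination form are my notation, not the paper's) illustrates the trade-off: at t = 0 a Huffman code wins, while as t grows a flatter code with a smaller maximum length is preferred.

```python
def payoff(probs, lengths, t):
    """Convex combination of maximum and average codeword length:
    t * max_i l_i + (1 - t) * sum_i p_i * l_i (lower is better)."""
    avg = sum(p * l for p, l in zip(probs, lengths))
    return t * max(lengths) + (1 - t) * avg
```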

    Nonasymptotic coding-rate bounds for binary erasure channels with feedback

    We present nonasymptotic achievability and converse bounds on the maximum coding rate (for a fixed average error probability and a fixed average blocklength) of variable-length full-feedback (VLF) and variable-length stop-feedback (VLSF) codes operating over a binary erasure channel (BEC). For the VLF setup, the achievability bound relies on a scheme that maps each message onto a variable-length Huffman codeword and then repeats each bit of the codeword until it is received correctly. The converse bound is inspired by the meta-converse framework by Polyanskiy, Poor, and Verdú (2010) and relies on binary sequential hypothesis testing. For the case of zero error probability, our achievability and converse bounds match. For the VLSF case, we provide achievability bounds that exploit the following feature of the BEC: the decoder can assess the correctness of its estimate by verifying whether the chosen codeword is the only one that is compatible with the erasure pattern. One of these bounds is obtained by analyzing the performance of a variable-length extension of random linear fountain codes. The gap between the VLSF achievability and the VLF converse bound, when the number of messages is small, is significant: 23% for 8 messages on a BEC with erasure probability 0.5. The absence of a tight VLSF converse bound does not allow us to assess whether this gap is fundamental.
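
The VLF achievability scheme (repeat each bit of the Huffman codeword until it arrives unerased) is easy to simulate. In this sketch (my own code, not the paper's), the expected number of channel uses per bit over a BEC with erasure probability ε is 1/(1 − ε), so a 4-bit codeword at ε = 0.5 needs 8 channel uses on average.

```python
import random

def bec_repeat_uses(codeword, erasure_p, rng):
    """Channel uses needed to deliver every bit of `codeword` over a BEC
    when each bit is repeated until it is received unerased."""
    uses = 0
    for _bit in codeword:
        uses += 1
        while rng.random() < erasure_p:  # erasure: repeat the same bit
            uses += 1
    return uses
```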

    Minimum Delay Huffman Code in Backward Decoding Procedure

    For applications where decoding speed and fault tolerance are important, such as video storage, one successful approach is fix-free codes. These codes have been adopted in standards such as H.263+ and MPEG-4. The cost of using fix-free codes is increased redundancy, i.e., more bits are needed to represent any piece of information. We therefore investigate Huffman codes with low or negligible backward decoding delay. We show that in almost all cases there is a minimum-delay Huffman code for a given length vector. The average delay of this code for anti-uniform sources is calculated, agrees with simulations, and is shown to be one bit for large source alphabets. An algorithm with good performance for finding the minimum-delay code is also proposed.
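
Whether a code is fix-free (decodable both forwards and backwards) is simple to check: it must be both prefix-free and suffix-free. A minimal sketch (my own helper functions):

```python
def is_prefix_free(codewords):
    """No codeword is a proper prefix of another."""
    return not any(c1 != c2 and c2.startswith(c1)
                   for c1 in codewords for c2 in codewords)

def is_fix_free(codewords):
    """Fix-free codes are prefix-free AND suffix-free, which is what
    allows decoding in both the forward and backward direction."""
    return (is_prefix_free(codewords)
            and is_prefix_free([c[::-1] for c in codewords]))
```

For example, the Huffman code {0, 10, 110} is prefix-free but not suffix-free (0 is a suffix of 10), so it is not fix-free; any fixed-length code is trivially fix-free.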

    Probability Mass Functions for which Sources have the Maximum Minimum Expected Length

    Let 𝒫_n be the set of all probability mass functions (PMFs) (p_1, p_2, ..., p_n) that satisfy p_i > 0 for 1 ≤ i ≤ n. Define the minimum expected length function L_D : 𝒫_n → ℝ such that L_D(P) is the minimum expected length of a prefix code, formed out of an alphabet of size D, for the discrete memoryless source having P as its source distribution. It is well known that the function L_D attains its maximum value at the uniform distribution. Further, when n is of the form D^m, with m being a positive integer, PMFs other than the uniform distribution at which L_D attains its maximum value are known. However, a complete characterization of all such PMFs at which the minimum expected length function attains its maximum value has not been done so far. This is done in this paper.
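
For the binary case D = 2, the function L_D can be computed directly with Huffman's algorithm, using the fact that the expected codeword length equals the sum of all merged weights. This sketch (a simplification to D = 2; the paper treats general D) lets one check numerically that the uniform distribution maximizes L_2 for a fixed n:

```python
import heapq

def min_expected_length(probs):
    """L_2(P): minimum expected length of a binary prefix code for P.
    Each Huffman merge of weights a and b adds a + b to the expected
    length, since it deepens both subtrees by one level."""
    heap = list(probs)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        a, b = heapq.heappop(heap), heapq.heappop(heap)
        total += a + b
        heapq.heappush(heap, a + b)
    return total
```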

    Tight Bounds on the Average Length, Entropy, and Redundancy of Anti-Uniform Huffman Codes

    In this paper we consider the class of anti-uniform Huffman (AUH) codes and derive tight lower and upper bounds on the average length, entropy, and redundancy of such codes in terms of the alphabet size of the source. Fibonacci distributions are introduced, which play a fundamental role in AUH codes. It is shown that such distributions maximize the average length and the entropy of the code for a given alphabet size. Another previously known bound on the entropy for a given average length follows immediately from our results.
    Comment: 9 pages, 2 figures
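
The role of Fibonacci weights can be checked numerically: running Huffman's algorithm on the weights F_n, ..., F_1 produces the maximally skewed (anti-uniform) length vector (1, 2, ..., n−1, n−1). A sketch (my own implementation, not the paper's):

```python
import heapq

def huffman_lengths(weights):
    """Codeword lengths of a binary Huffman code for the given weights."""
    lengths = [0] * len(weights)
    heap = [(w, [i]) for i, w in enumerate(weights)]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, g1 = heapq.heappop(heap)
        w2, g2 = heapq.heappop(heap)
        for s in g1 + g2:
            lengths[s] += 1          # merge deepens both groups
        heapq.heappush(heap, (w1 + w2, g1 + g2))
    return lengths

def fibonacci_weights(n):
    """Weights F_n, ..., F_1 (with F_1 = F_2 = 1), largest first."""
    fib = [1, 1]
    while len(fib) < n:
        fib.append(fib[-1] + fib[-2])
    return fib[:n][::-1]
```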

    Efficient and Compact Representations of Prefix Codes

    Most of the attention in statistical compression is given to the space used by the compressed sequence, a problem completely solved with optimal prefix codes. However, in many applications, the storage space used to represent the prefix code itself can be an issue. In this paper we introduce and compare several techniques to store prefix codes. Let N be the sequence length and n be the alphabet size. Then a naive storage of an optimal prefix code uses O(n log n) bits. Our first technique shows how to use O(n log log(N/n)) bits to store the optimal prefix code. Then we introduce an approximate technique that, for any 0 < ε < 1/2, takes O(n log log(1/ε)) bits to store a prefix code with average codeword length within an additive ε of the minimum. Finally, a second approximation takes, for any constant c > 1, O(n^{1/c} log n) bits to store a prefix code with average codeword length at most c times the minimum. In all cases, our data structures allow encoding and decoding of any symbol in O(1) time. We experimentally compare our new techniques with the state of the art, showing that we achieve 6--8-fold space reductions, at the price of slower encoding (2.5--8 times slower) and decoding (12--24 times slower). The approximations further reduce this space and improve the time significantly, up to recovering the speed of classical implementations, for a moderate penalty in the average code length. As a byproduct, we compare various heuristic, approximate, and optimal algorithms to generate length-restricted codes, showing that the optimal ones are clearly superior and practical enough to be implemented.
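
The starting point for all compact representations is that an optimal prefix code need not store its codewords at all: storing only the length of each codeword suffices, because a canonical code can be rebuilt from the lengths. A sketch of that standard reconstruction (a simplification, assuming binary codewords; not the paper's O(1)-time data structure):

```python
def canonical_codes(lengths):
    """Rebuild a canonical prefix code from codeword lengths alone.
    Codewords are assigned in order of (length, symbol index), each one
    obtained by incrementing the previous code and left-shifting to the
    new length."""
    order = sorted(range(len(lengths)), key=lambda i: (lengths[i], i))
    codes = [None] * len(lengths)
    code, prev_len = 0, 0
    for i in order:
        code <<= (lengths[i] - prev_len)
        codes[i] = format(code, '0{}b'.format(lengths[i]))
        prev_len = lengths[i]
        code += 1
    return codes
```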

    The number of Huffman codes, compact trees, and sums of unit fractions

    The number of "nonequivalent" Huffman codes of length r over an alphabet of size t has been studied frequently. Equivalently, the number of "nonequivalent" complete t-ary trees has been examined. We first survey the literature, unifying several independent approaches to the problem. Then, improving on earlier work, we prove a very precise asymptotic result on the counting function, consisting of two main terms and an error term.
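
The objects being counted can be generated by a small dynamic program over level profiles: at each level of a complete t-ary tree one chooses how many of the available nodes are internal, and a code of maximal length r corresponds to a profile with exactly r levels. This sketch (my own formulation; it counts length multisets, i.e. level profiles, which is the usual notion of "nonequivalent" here) reproduces the first few binary values:

```python
from functools import lru_cache

def count_codes(t, r):
    """Number of nonequivalent t-ary Huffman codes (codeword-length
    multisets) whose maximum codeword length is exactly r."""
    @lru_cache(maxsize=None)
    def profiles(m, d):
        # profiles with m nodes at the current level, at most d levels below
        if d == 0:
            return 1                      # all m nodes must be leaves
        return sum(profiles(t * k, d - 1) for k in range(m + 1))
    shallower = profiles(t, r - 2) if r >= 2 else 0
    return profiles(t, r - 1) - shallower  # exactly r = (≤ r) − (≤ r−1)
```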

    Efficient and Compact Representations of Some Non-Canonical Prefix-Free Codes

    For many kinds of prefix-free codes there are efficient and compact alternatives to the traditional tree-based representation. Since these put the codes into canonical form, however, they can only be used when we can choose the order in which codewords are assigned to characters. In this paper we first show how, given a probability distribution over an alphabet of σ characters, we can store a nearly optimal alphabetic prefix-free code in o(σ) bits such that we can encode and decode any character in constant time. We then consider a kind of code introduced recently to reduce the space usage of wavelet matrices (Claude, Navarro, and Ordóñez, Information Systems, 2015). They showed how to build an optimal prefix-free code such that the codewords' lengths are non-decreasing when they are arranged such that their reverses are in lexicographic order. We show how to store such a code in O(σ log L + 2^{εL}) bits, where L is the maximum codeword length and ε is any positive constant, such that we can encode and decode any character in constant time under reasonable assumptions. Otherwise, we can always encode and decode a codeword of ℓ bits in time O(ℓ) using O(σ log L) bits of space.
    Comment: This research has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie Actions H2020-MSCA-RISE-2015 BIRDS GA No. 69094
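
The structural property of the wavelet-matrix codes is easy to state as a predicate: when the codewords are sorted by their reversed lexicographic order, the lengths must be non-decreasing. A small checker (my own code, useful for experimenting with candidate codes):

```python
def has_wavelet_matrix_property(codewords):
    """True iff codeword lengths are non-decreasing when the codewords
    are arranged so that their reverses are in lexicographic order, the
    property of the codes of Claude, Navarro, and Ordóñez (2015)."""
    ordered = sorted(codewords, key=lambda c: c[::-1])
    lens = [len(c) for c in ordered]
    return lens == sorted(lens)
```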