Prefix Codes: Equiprobable Words, Unequal Letter Costs
Describes a near-linear-time algorithm for a variant of Huffman coding, in
which the letters may have non-uniform lengths (as in Morse code), but with the
restriction that each word to be encoded has equal probability. [See also
"Huffman Coding with Unequal Letter Costs" (2002).]
Comment: proceedings version in ICALP (1994)
Dynamic Shannon Coding
We present a new algorithm for dynamic prefix-free coding, based on Shannon
coding. We give a simple analysis and prove a better upper bound on the length
of the encoding produced than the corresponding bound for dynamic Huffman
coding. We show how our algorithm can be modified for efficient
length-restricted coding, alphabetic coding and coding with unequal letter
costs.
Comment: 6 pages; conference version presented at ESA 2004; journal version
submitted to IEEE Transactions on Information Theory
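As background for the abstract above, here is a minimal sketch of classical (static) Shannon coding, in which a symbol of probability p gets a codeword of length ceil(log2(1/p)). It is not the paper's dynamic algorithm; the function names and example distributions are illustrative assumptions.

```python
import math

def shannon_code_lengths(probs):
    """Codeword length ceil(log2(1/p)) for each probability p (static Shannon coding)."""
    return [math.ceil(-math.log2(p)) for p in probs]

def kraft_sum(lengths):
    """Sum of 2^(-length); a binary prefix-free code with these lengths exists iff this is <= 1."""
    return sum(2.0 ** -l for l in lengths)

def expected_length(probs, lengths):
    """Average codeword length under the given distribution."""
    return sum(p * l for p, l in zip(probs, lengths))

# Hypothetical example distributions (not taken from the paper).
dyadic = [0.5, 0.25, 0.125, 0.125]
skewed = [0.4, 0.3, 0.2, 0.1]

dyadic_lengths = shannon_code_lengths(dyadic)
skewed_lengths = shannon_code_lengths(skewed)
print(dyadic_lengths, kraft_sum(dyadic_lengths), expected_length(dyadic, dyadic_lengths))
print(skewed_lengths, kraft_sum(skewed_lengths), expected_length(skewed, skewed_lengths))
```

Because each length is at least log2(1/p), the Kraft sum never exceeds 1, so a prefix-free code with these lengths always exists, and the expected length is less than the entropy plus one bit.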
More Efficient Algorithms and Analyses for Unequal Letter Cost Prefix-Free Coding
There is a large literature devoted to the problem of finding an optimal
(min-cost) prefix-free code with an unequal letter-cost encoding alphabet of
size. While there is no known polynomial-time algorithm for solving it
optimally, there are many good heuristics, all within an additive error of
optimal. The additive error in these algorithms usually depends linearly upon
the largest encoding letter size.
This paper was motivated by the problem of finding optimal codes when the
encoding alphabet is infinite. Because the largest letter cost is infinite, the
previous analyses could give infinite error bounds. We provide a new algorithm
that works with infinite encoding alphabets. When restricted to the finite
alphabet case, our algorithm often provides better error bounds than the best
previous ones known.
Comment: 29 pages; 9 figures
Infinite anti-uniform sources
In this paper we consider the class of anti-uniform Huffman (AUH) codes for sources with infinite alphabet. Poisson, negative binomial, geometric and exponential distributions lead to infinite anti-uniform sources for some ranges of their parameters. Huffman coding of these sources results in AUH codes. We prove that as a result of this encoding, we obtain sources with memory. For these sources we attach the graph and derive the transition matrix between states, the state probabilities and the entropy. If c0 and c1 denote the costs of storing or transmitting the symbols "0" and "1", respectively, we compute the average cost for these AUH codes.
Comment: 6 pages
Huffman Coding with Letter Costs: A Linear-Time Approximation Scheme
We give a polynomial-time approximation scheme for the generalization of
Huffman Coding in which codeword letters have non-uniform costs (as in Morse
code, where the dash is twice as long as the dot). The algorithm computes a
(1+epsilon)-approximate solution in time O(n + f(epsilon) log^3 n), where n is
the input size.
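To make the objective in the abstract above concrete, the following sketch evaluates the expected cost of a given prefix-free code when code letters have unequal costs. It only scores a code; it is not the approximation scheme itself. The letter costs follow the abstract's dot/dash example, while the codewords, probabilities, and function names are hypothetical.

```python
def is_prefix_free(codewords):
    """True if no codeword is a prefix of another (duplicates also fail)."""
    ordered = sorted(codewords)
    # After sorting, a word that is a prefix of any other word is a prefix of its successor.
    return not any(b.startswith(a) for a, b in zip(ordered, ordered[1:]))

def codeword_cost(word, letter_costs):
    """Cost of one codeword: the sum of the costs of its letters."""
    return sum(letter_costs[ch] for ch in word)

def expected_cost(probs, codewords, letter_costs):
    """Probability-weighted average codeword cost."""
    return sum(p * codeword_cost(w, letter_costs) for p, w in zip(probs, codewords))

# Assumed costs: a dot costs 1 and a dash costs 2, as in the abstract's example.
letter_costs = {".": 1, "-": 2}
# Hypothetical three-word source and code.
probs = [0.5, 0.3, 0.2]
codewords = [".", "-.", "--"]

print(is_prefix_free(codewords))
print(expected_cost(probs, codewords, letter_costs))
```

With equal letter costs this reduces to the usual expected codeword length that Huffman coding minimizes.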
First-Come-First-Served for Online Slot Allocation and Huffman Coding
Can one choose a good Huffman code on the fly, without knowing the underlying
distribution? Online Slot Allocation (OSA) models this and similar problems:
There are n slots, each with a known cost. There are n items. Requests for
items are drawn i.i.d. from a fixed but hidden probability distribution p.
After each request, if the item, i, was not previously requested, then the
algorithm (knowing the slot costs and the requests so far, but not p) must
place the item in some vacant slot j(i). The goal is to minimize the sum, over
the items, of the probability of the item times the cost of its assigned slot.
The optimal offline algorithm is trivial: put the most probable item in the
cheapest slot, the second most probable item in the second cheapest slot, etc.
The optimal online algorithm is First Come First Served (FCFS): put the first
requested item in the cheapest slot, the second (distinct) requested item in
the second cheapest slot, etc. The optimal competitive ratios for any online
algorithm are 1+H(n-1) ~ ln n for general costs and 2 for concave costs. For
logarithmic costs, the ratio is, asymptotically, 1: FCFS gives cost opt + O(log
opt).
For Huffman coding, FCFS yields an online algorithm (one that allocates
codewords on demand, without knowing the underlying probability distribution)
that guarantees asymptotically optimal cost: at most opt + 2 log(1+opt) + 2.
Comment: ACM-SIAM Symposium on Discrete Algorithms (SODA) 201
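The FCFS rule and the offline optimum described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, slot costs, probabilities, and request sequence are hypothetical, and the request sequence is fixed by hand rather than drawn i.i.d. from p.

```python
def offline_cost(slot_costs, probs):
    """Offline optimum: most probable item in the cheapest slot, and so on."""
    costs = sorted(slot_costs)
    ranked = sorted(probs, reverse=True)
    return sum(p * c for p, c in zip(ranked, costs))

def fcfs_assign(slot_costs, requests):
    """FCFS: each newly requested item takes the cheapest slot still vacant."""
    vacant = sorted(slot_costs, reverse=True)  # pop() from the end yields the cheapest
    assigned = {}
    for item in requests:
        if item not in assigned and vacant:
            assigned[item] = vacant.pop()
    return assigned

def assignment_cost(assigned, probs):
    """Sum over assigned items of probability times the cost of the item's slot."""
    return sum(probs[item] * cost for item, cost in assigned.items())

# Hypothetical instance: three slots, three items (indexed 0..2).
slot_costs = [1, 2, 3]
probs = [0.6, 0.3, 0.1]
# Assumed request order; item 1 happens to arrive before the more probable item 0.
requests = [1, 0, 1, 2]

assigned = fcfs_assign(slot_costs, requests)
print(assigned)
print(assignment_cost(assigned, probs), offline_cost(slot_costs, probs))
```

Items that never appear in the request sequence stay unassigned in this sketch, so the two costs are directly comparable only once every item has been requested at least once.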