2,003 research outputs found

    More Efficient Algorithms and Analyses for Unequal Letter Cost Prefix-Free Coding

    There is a large literature devoted to the problem of finding an optimal (min-cost) prefix-free code with an unequal letter-cost encoding alphabet of size. While there is no known polynomial-time algorithm for solving it optimally, there are many good heuristics that all come within an additive error of optimal. The additive error in these algorithms usually depends linearly upon the largest encoding letter size. This paper was motivated by the problem of finding optimal codes when the encoding alphabet is infinite. Because the largest letter cost is infinite, the previous analyses could give infinite error bounds. We provide a new algorithm that works with infinite encoding alphabets. When restricted to the finite alphabet case, our algorithm often provides better error bounds than the best previously known ones. Comment: 29 pages; 9 figures
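To make the objective in this abstract concrete, the following minimal sketch computes the expected cost of a prefix-free code when encoding letters have unequal costs. It is not the paper's algorithm; the two-letter alphabet, its costs, the probabilities, and all function names are illustrative assumptions.

```python
def is_prefix_free(codewords):
    """Return True if no codeword is a prefix of (or equal to) another."""
    for i, u in enumerate(codewords):
        for j, v in enumerate(codewords):
            if i != j and v.startswith(u):
                return False
    return True


def codeword_cost(word, letter_costs):
    """Cost of a codeword is the sum of the costs of its letters."""
    return sum(letter_costs[ch] for ch in word)


def expected_cost(probs, codewords, letter_costs):
    """Expected cost sum_i p_i * cost(w_i) of a prefix-free code."""
    if not is_prefix_free(codewords):
        raise ValueError("code is not prefix-free")
    return sum(p * codeword_cost(w, letter_costs)
               for p, w in zip(probs, codewords))


# Hypothetical example: letter 'a' costs 1, letter 'b' costs 2.
letter_costs = {"a": 1, "b": 2}
probs = [0.5, 0.3, 0.2]
codewords = ["a", "ba", "bb"]
print(expected_cost(probs, codewords, letter_costs))  # approximately 2.2
```

An optimal code minimizes this quantity over all prefix-free codes for the given probabilities; the sketch only evaluates one candidate.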

    Optimal Prefix Codes for Infinite Alphabets with Nonlinear Costs

    Let $P = \{p(i)\}$ be a measure of strictly positive probabilities on the set of nonnegative integers. Although the countable number of inputs prevents usage of the Huffman algorithm, there are nontrivial $P$ for which known methods find a source code that is optimal in the sense of minimizing expected codeword length. For some applications, however, a source code should instead minimize one of a family of nonlinear objective functions, $\beta$-exponential means, those of the form $\log_a \sum_i p(i) a^{n(i)}$, where $n(i)$ is the length of the $i$th codeword and $a$ is a positive constant. Applications of such minimizations include a novel problem of maximizing the chance of message receipt in single-shot communications ($a<1$) and a previously known problem of minimizing the chance of buffer overflow in a queueing system ($a>1$). This paper introduces methods for finding codes optimal for such exponential means. One method applies to geometric distributions, while another applies to distributions with lighter tails. The latter algorithm is applied to Poisson distributions and both are extended to alphabetic codes, as well as to minimizing maximum pointwise redundancy. The aforementioned application of minimizing the chance of buffer overflow is also considered. Comment: 14 pages, 6 figures, accepted to IEEE Trans. Inform. Theor
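As a rough illustration of the objective $\log_a \sum_i p(i) a^{n(i)}$, the sketch below evaluates it for a small finite code and compares it with the ordinary expected length. The finite distribution and codeword lengths are assumptions chosen for the example; the paper itself treats infinite alphabets, which this sketch does not handle.

```python
import math


def exponential_mean(probs, lengths, a):
    """Evaluate log_a( sum_i p(i) * a**n(i) ) for a > 0, a != 1."""
    if a <= 0 or a == 1:
        raise ValueError("a must be positive and different from 1")
    total = sum(p * a ** n for p, n in zip(probs, lengths))
    return math.log(total, a)


def expected_length(probs, lengths):
    """Ordinary expected codeword length sum_i p(i) * n(i)."""
    return sum(p * n for p, n in zip(probs, lengths))


# Hypothetical finite source with codeword lengths 1, 2, 3, 3.
probs = [0.5, 0.25, 0.125, 0.125]
lengths = [1, 2, 3, 3]

print(expected_length(probs, lengths))        # linear objective
print(exponential_mean(probs, lengths, 2.0))  # a > 1 penalizes long codewords
print(exponential_mean(probs, lengths, 0.5))  # a < 1 rewards short codewords
```

By Jensen's inequality the exponential mean is at least the expected length when $a>1$ and at most the expected length when $a<1$, which the three printed values reflect.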

    Infinite anti-uniform sources

    6 pages. International audience. In this paper we consider the class of anti-uniform Huffman (AUH) codes for sources with infinite alphabet. Poisson, negative binomial, geometric and exponential distributions lead to infinite anti-uniform sources for some ranges of their parameters. Huffman coding of these sources results in AUH codes. We prove that as a result of this encoding, we obtain sources with memory. For these sources we attach the graph and derive the transition matrix between states, the state probabilities and the entropy. If c0 and c1 denote the costs for storing or transmission of symbols "0" and "1", respectively, we compute the average cost for these AUH codes.
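The average-cost computation mentioned in this abstract can be sketched for one simple case. The code below assumes a geometric source $p(i) = (1-q)q^i$ whose parameter lies in a range where the Huffman code is anti-uniform, and assumes the codeword for symbol $i$ is $i$ ones followed by a zero. This codeword assignment, the truncation of the infinite sum, and all names are assumptions for illustration, not the paper's derivation.

```python
def unary_codeword(i):
    """Assumed anti-uniform codeword for symbol i: i ones, then a zero."""
    return "1" * i + "0"


def codeword_cost(word, c0, c1):
    """Cost of a binary codeword with per-symbol costs c0 and c1."""
    return word.count("0") * c0 + word.count("1") * c1


def average_cost_geometric(q, c0, c1, terms=200):
    """Truncated average cost for p(i) = (1 - q) * q**i, i = 0, 1, 2, ..."""
    return sum((1 - q) * q ** i * codeword_cost(unary_codeword(i), c0, c1)
               for i in range(terms))


def average_cost_closed_form(q, c0, c1):
    """Each codeword has one '0' and i '1's, so the mean is c0 + c1 * E[i]."""
    return c0 + c1 * q / (1 - q)


print(average_cost_geometric(0.5, 1.0, 2.0))
print(average_cost_closed_form(0.5, 1.0, 2.0))
```

Under these assumptions the two values agree up to the truncation error, since the mean of the geometric distribution is $q/(1-q)$.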

    Universal Indexes for Highly Repetitive Document Collections

    Indexing highly repetitive collections has become a relevant problem with the emergence of large repositories of versioned documents, among other applications. These collections may reach huge sizes, but are formed mostly of documents that are near-copies of others. Traditional techniques for indexing these collections fail to properly exploit their regularities in order to reduce space. We introduce new techniques for compressing inverted indexes that exploit this near-copy regularity. They are based on run-length, Lempel-Ziv, or grammar compression of the differential inverted lists, instead of the usual practice of gap-encoding them. We show that, in this highly repetitive setting, our compression methods significantly reduce the space obtained with classical techniques, at the price of moderate slowdowns. Moreover, our best methods are universal, that is, they do not need to know the versioning structure of the collection, nor that a clear versioning structure even exists. We also introduce compressed self-indexes in the comparison. These are designed for general strings (not only natural language texts) and represent the text collection plus the index structure (not an inverted index) in integrated form. We show that these techniques can compress much further, using a small fraction of the space required by our new inverted indexes. Yet, they are orders of magnitude slower. Comment: This research has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Actions H2020-MSCA-RISE-2015 BIRDS GA No. 69094
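The idea of run-length compressing a differential inverted list can be sketched as follows. This is a toy illustration under the assumption that postings are sorted positive document identifiers and that consecutive versions of a document receive consecutive identifiers; it is not the paper's data structure, and the function names are hypothetical.

```python
from itertools import groupby


def to_gaps(postings):
    """Differential list: each entry is the difference from the previous ID."""
    gaps, previous = [], 0
    for doc_id in postings:
        gaps.append(doc_id - previous)
        previous = doc_id
    return gaps


def run_length_encode(values):
    """Collapse runs of equal values into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def decode_postings(runs):
    """Rebuild the original postings from run-length encoded gaps."""
    postings, current = [], 0
    for gap, count in runs:
        for _ in range(count):
            current += gap
            postings.append(current)
    return postings


# A term that occurs in many consecutive versions yields long runs of gap 1.
postings = [3, 4, 5, 6, 7, 20, 21, 22]
runs = run_length_encode(to_gaps(postings))
print(runs)
print(decode_postings(runs) == postings)
```

Plain gap-encoding would store all eight gaps, whereas the run-length form stores four pairs here; the saving grows with the length of the runs.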

    Some basic properties of fix-free codes.

    by Chunxuan Ye. Thesis (M.Phil.)--Chinese University of Hong Kong, 2000. Includes bibliographical references (leaves 74-[78]). Abstracts in English and Chinese.
    Chapter 1 --- Introduction
        1.1 --- Information Theory
        1.2 --- Source Coding
        1.3 --- Fixed Length Codes and Variable Length Codes
        1.4 --- Prefix Codes
            1.4.1 --- Kraft Inequality
            1.4.2 --- Huffman Coding
    Chapter 2 --- Existence of Fix-Free Codes
        2.1 --- Introduction
        2.2 --- Previous Results
            2.2.1 --- Complete Fix-Free Codes
            2.2.2 --- Ahlswede's Results
        2.3 --- Two Properties of Fix-Free Codes
        2.4 --- A Sufficient Condition
        2.5 --- Other Sufficient Conditions
        2.6 --- A Necessary Condition
        2.7 --- A Necessary and Sufficient Condition
    Chapter 3 --- Redundancy of Optimal Fix-Free Codes
        3.1 --- Introduction
        3.2 --- An Upper Bound in Terms of q
        3.3 --- An Upper Bound in Terms of p1
        3.4 --- An Upper Bound in Terms of pn
    Chapter 4 --- Two Applications of the Probabilistic Method
        4.1 --- An Alternative Proof for the Kraft Inequality
        4.2 --- A Characteristic Inequality for "1"-ended Codes
    Chapter 5 --- Summary and Future Work
    Appendix
        A Length Assignment for Upper Bounding the Redundancy of Fix-Free Codes
    Bibliography

    WIMAX TESTBED

    WiMAX, the Worldwide Interoperability for Microwave Access, is a telecommunications technology aimed at providing wireless data over long distances in a variety of ways, from point-to-point links to full mobile cellular type access. It is based on the IEEE 802.16 standard, which is also called WirelessMAN. The name WiMAX was created by the WiMAX Forum, which was formed in June 2001 to promote conformance and interoperability of the standard. The forum describes WiMAX as a standards-based technology enabling the delivery of last mile wireless broadband access as an alternative to cable and DSL. This Final Year Project attempts to simulate, via Simulink, the working mechanism of a WiMAX testbed that includes a transmitter, channel and receiver. This undertaking involves the baseband physical radio link. A Rayleigh channel model together with frequency and timing offsets is introduced to the system, and a blind receiver attempts to correct these offsets and provide channel equalization. The testbed uses the Double Sliding Window for timing offset synchronization and the Schmidl & Cox algorithm for Fractional Frequency Offset estimation. The Integer Frequency Offset synchronization is achieved via correlation of the incoming preamble with its local copy, whereas the Residual Carrier Frequency Offset is estimated using the L-th extension method. A linear Channel Estimator is added and combined with all the other blocks to form the testbed. The results show that this testbed meets the standard's BER requirements when the SNR is 18 dB or higher. At these SNRs, the receiver side of the testbed succeeds in performing the required synchronization and recovering the transmitted data. Sending data with an SNR lower than 18 dB compromises its performance, as the channel equalizer is non-linear. This project also takes the first few steps of hardware implementation by using Real Time Workshop to convert the Simulink model into C code that runs outside MATLAB. In addition, the Double Sliding Window and Schmidl & Cox blocks are converted to Xilinx blocks and proven to work like their Simulink counterparts.
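The fractional frequency offset step named in this abstract can be sketched in a few lines. The sketch assumes a preamble made of two identical halves (the structure the Schmidl & Cox estimator relies on), a noise-free channel, known timing, and an offset expressed in units of the subcarrier spacing. The preamble contents, sizes, and function names are illustrative assumptions and do not reproduce the project's Simulink blocks or the 802.16 preamble.

```python
import cmath
import math
import random


def make_preamble(half_len, seed=0):
    """Hypothetical preamble: a random QPSK-like half, repeated twice."""
    rng = random.Random(seed)
    half = [complex(rng.choice((-1, 1)), rng.choice((-1, 1)))
            for _ in range(half_len)]
    return half + half


def apply_cfo(samples, eps, fft_size):
    """Rotate sample n by exp(j*2*pi*eps*n/fft_size), eps in subcarrier spacings."""
    return [s * cmath.exp(2j * math.pi * eps * n / fft_size)
            for n, s in enumerate(samples)]


def estimate_fractional_cfo(rx, half_len):
    """Correlate the two halves; the phase of the sum equals pi * eps."""
    p = sum(rx[n].conjugate() * rx[n + half_len] for n in range(half_len))
    return cmath.phase(p) / math.pi


half_len = 64
preamble = make_preamble(half_len)
received = apply_cfo(preamble, eps=0.3, fft_size=2 * half_len)
print(estimate_fractional_cfo(received, half_len))
```

Because the phase is only known modulo $2\pi$, this estimator resolves offsets with magnitude below one subcarrier spacing under the stated assumptions; larger offsets need a separate integer-offset step such as the preamble correlation the abstract describes.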