
    Universal Lossless Compression with Unknown Alphabets - The Average Case

    Universal compression of patterns of sequences generated by independent and identically distributed (i.i.d.) sources with unknown, possibly large, alphabets is investigated. A pattern is a sequence of indices that contains all consecutive indices in increasing order of first occurrence. If the alphabet of the source that generated a sequence is unknown, the inevitable cost of coding the unknown alphabet symbols can be exploited to create the pattern of the sequence. This pattern can in turn be compressed by itself. It is shown that if the alphabet size k is essentially small, then the average minimax and maximin redundancies, as well as the redundancy of every code for almost every source, when compressing a pattern, consist of at least 0.5 log(n/k^3) bits per unknown probability parameter, and if all alphabet letters are likely to occur, there exist codes whose redundancy is at most 0.5 log(n/k^2) bits per unknown probability parameter, where n is the length of the data sequences. Otherwise, if the alphabet is large, these redundancies are essentially at least O(n^{-2/3}) bits per symbol, and there exist codes that achieve redundancy of essentially O(n^{-1/2}) bits per symbol. Two sub-optimal low-complexity sequential algorithms for compression of patterns are presented and their description lengths analyzed; it is also pointed out that the average universal description length of a pattern can decrease below the underlying i.i.d. entropy for large enough alphabets.
    Comment: Revised for IEEE Transactions on Information Theory.
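
    To make the notion of a pattern concrete, here is a minimal sketch in Python (the function name pattern_of is ours) that maps a sequence to its pattern by numbering symbols in order of first occurrence:

        def pattern_of(seq):
            """Replace each symbol by the index (1, 2, 3, ...) of its
            first occurrence, yielding the pattern of the sequence."""
            first_seen = {}  # symbol -> index of first occurrence
            pattern = []
            for symbol in seq:
                if symbol not in first_seen:
                    first_seen[symbol] = len(first_seen) + 1  # next unused index
                pattern.append(first_seen[symbol])
            return pattern

        # "abracadabra" -> [1, 2, 3, 1, 4, 1, 5, 1, 2, 3, 1]
        print(pattern_of("abracadabra"))

    The pattern is independent of the actual alphabet: any renaming of the symbols produces the same pattern, which is why the cost of describing the symbols themselves can be separated from the cost of compressing the pattern.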

    Connectivity Compression for Irregular Quadrilateral Meshes

    Applications that require Internet access to remote 3D datasets are often limited by the storage costs of 3D models. Several compression methods are available to address these limits for objects represented by triangle meshes. Many CAD and VRML models, however, are represented as quadrilateral meshes or mixed triangle/quadrilateral meshes, and these models may also require compression. We present an algorithm for encoding the connectivity of such quadrilateral meshes, and we demonstrate that by preserving and exploiting the original quad structure, our approach achieves encodings 30-80% smaller than an approach based on randomly splitting quads into triangles. We present both a code with a proven worst-case cost of 3 bits per vertex (or 2.75 bits per vertex for meshes without valence-two vertices) and entropy-coding results for typical meshes ranging from 0.3 to 0.9 bits per vertex, depending on the regularity of the mesh. Our method may be implemented by a rule for a particular splitting of quads into triangles, combined with the compression and decompression algorithms introduced in [Rossignac99] and [Rossignac&Szymczak99]. We also present extensions of the algorithm that compress meshes with holes and handles, as well as meshes containing triangles and other polygons in addition to quads.
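
    To illustrate the baseline the paper improves on, the following sketch (Python; the names and the fixed diagonal choice are ours, not the paper's specific splitting rule) contrasts a deterministic quad split with the random split used for comparison:

        import random

        def split_quad_fixed(quad):
            """Split quad (a, b, c, d) along the fixed diagonal (a, c).
            A deterministic rule keeps the original quad recoverable."""
            a, b, c, d = quad
            return [(a, b, c), (a, c, d)]

        def split_quad_random(quad):
            """Split along a randomly chosen diagonal: the baseline that
            the paper's encodings are 30-80% smaller than."""
            a, b, c, d = quad
            if random.random() < 0.5:
                return [(a, b, c), (a, c, d)]
            return [(b, c, d), (b, d, a)]

        quads = [(0, 1, 2, 3), (1, 4, 5, 2)]
        print([t for q in quads for t in split_quad_fixed(q)])
        # [(0, 1, 2), (0, 2, 3), (1, 4, 5), (1, 5, 2)]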

    Capacity and Random-Coding Exponents for Channel Coding with Side Information

    Capacity formulas and random-coding exponents are derived for a generalized family of Gel'fand-Pinsker coding problems. These exponents yield asymptotic upper bounds on the achievable log-probability of error. In our model, information is to be reliably transmitted through a noisy channel with finite input and output alphabets and a random state sequence, and the channel is selected by a hypothetical adversary. Partial information about the state sequence is available to the encoder, adversary, and decoder. The design of the transmitter is subject to a cost constraint. Two families of channels are considered: 1) compound discrete memoryless channels (CDMC), and 2) channels with arbitrary memory, subject to an additive cost constraint or, more generally, to a hard constraint on the conditional type of the channel output given the input. The two problems are closely connected. The random-coding exponent is achieved using a stacked binning scheme and a maximum penalized mutual information decoder, which may be thought of as an empirical generalized maximum a posteriori decoder. For channels with arbitrary memory, the random-coding exponents are larger than their CDMC counterparts. Applications of this study include watermarking, data hiding, communication in the presence of partially known interferers, and problems such as broadcast channels, all of which involve the fundamental idea of binning.
    Comment: To appear in IEEE Transactions on Information Theory, without Appendices G and H.
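
    For reference, in the classical special case (a single fixed channel, no adversary, and the state sequence known non-causally to the encoder alone), the family reduces to the standard Gel'fand-Pinsker setting, whose capacity is

        C = max_{p(u|s), x = f(u,s)} [ I(U;Y) - I(U;S) ]

    where U is an auxiliary random variable. The binning scheme mentioned above is what pays the I(U;S) penalty: the U-codebook is partitioned into bins, and the encoder searches its bin for a codeword jointly typical with the observed state.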

    Mathematical Programming Decoding of Binary Linear Codes: Theory and Algorithms

    Mathematical programming is a branch of applied mathematics that has recently been used to derive new decoding approaches, challenging established but often heuristic algorithms based on iterative message passing. Concepts from mathematical programming used in the context of decoding include linear, integer, and nonlinear programming, network flows, notions of duality, as well as matroid and polyhedral theory. This survey article reviews and categorizes decoding methods based on mathematical programming approaches for binary linear codes over binary-input memoryless symmetric channels.
    Comment: 17 pages, submitted to the IEEE Transactions on Information Theory. Published July 201
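
    As a concrete instance of the linear-programming branch of this taxonomy, Feldman-style LP decoding relaxes maximum-likelihood decoding over a polytope described by one set of inequalities per parity check. A minimal sketch (Python with NumPy/SciPy; the toy parity-check matrix and LLR values are ours):

        from itertools import combinations
        import numpy as np
        from scipy.optimize import linprog

        def lp_decode(H, llr):
            """LP decoding: minimize llr . x over the relaxed codeword
            polytope of parity-check matrix H (Feldman's formulation:
            for each check and each odd-size subset S of its variables,
            sum_{i in S} x_i - sum_{i not in S} x_i <= |S| - 1)."""
            n = H.shape[1]
            A, b = [], []
            for row in H:
                N = list(np.flatnonzero(row))  # variables in this check
                for size in range(1, len(N) + 1, 2):  # odd subset sizes
                    for S in combinations(N, size):
                        a = np.zeros(n)
                        a[list(S)] = 1.0
                        a[[i for i in N if i not in S]] = -1.0
                        A.append(a)
                        b.append(len(S) - 1)
            res = linprog(llr, A_ub=np.array(A), b_ub=np.array(b),
                          bounds=[(0, 1)] * n)
            return res.x

        # Toy example: length-3 repetition code (x1 = x2 = x3).
        H = np.array([[1, 1, 0],
                      [0, 1, 1]])
        llr = np.array([-1.0, 2.0, -1.5])  # channel log-likelihood ratios
        print(lp_decode(H, llr))  # -> [1. 1. 1.], the ML codeword here

    When the LP optimum is integral, as in this toy run, it is provably the maximum-likelihood codeword; fractional optima are exactly where the relaxation and the integer-programming refinements surveyed in the article diverge.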