Universal Lossless Compression with Unknown Alphabets - The Average Case
Universal compression of patterns of sequences generated by independently
identically distributed (i.i.d.) sources with unknown, possibly large,
alphabets is investigated. A pattern is a sequence of indices that contains all
consecutive indices in increasing order of first occurrence. If the alphabet of
a source that generated a sequence is unknown, the inevitable cost of coding
the unknown alphabet symbols can be exploited to create the pattern of the
sequence. This pattern can in turn be compressed by itself. It is shown that if
the alphabet size is essentially small, then the average minimax and
maximin redundancies as well as the redundancy of every code for almost every
source, when compressing a pattern, consist of at least 0.5 log(n/k^3) bits per
unknown probability parameter, and if all alphabet letters are likely to
occur, there exist codes whose redundancy is at most 0.5 log(n/k^2) bits per
unknown probability parameter, where n is the length of the data
sequences and k is the alphabet size. Otherwise, if the alphabet is large, these redundancies are
essentially at least O(n^{-2/3}) bits per symbol, and there exist codes that
achieve redundancy of essentially O(n^{-1/2}) bits per symbol. Two sub-optimal
low-complexity sequential algorithms for compression of patterns are presented
and their description lengths analyzed, also pointing out that the pattern
average universal description length can decrease below the underlying i.i.d.
entropy for large enough alphabets.
Comment: Revised for IEEE Transactions on Information Theory
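The definition of a pattern given in this abstract (each symbol replaced by the index of its first occurrence, with indices assigned consecutively) can be illustrated with a short sketch. This is only an illustration of the definition, not the paper's compression algorithm; the function name `pattern` and the choice to start indices at 1 are assumptions made here.

```python
def pattern(sequence):
    """Return the pattern of a sequence: each symbol is replaced by the
    index (assumed here to start at 1) of its order of first occurrence."""
    first_seen = {}  # symbol -> index assigned at its first occurrence
    indices = []
    for symbol in sequence:
        if symbol not in first_seen:
            # A new symbol receives the next consecutive index.
            first_seen[symbol] = len(first_seen) + 1
        indices.append(first_seen[symbol])
    return indices


print(pattern("lossless"))  # [1, 2, 3, 3, 1, 4, 3, 3]
```

The pattern discards the identity of the alphabet letters, so "lossless" and any relabeling of it share the same pattern.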
Connectivity Compression for Irregular Quadrilateral Meshes
Applications that require Internet access to remote 3D datasets are often
limited by the storage costs of 3D models. Several compression methods are
available to address these limits for objects represented by triangle meshes.
Many CAD and VRML models, however, are represented as quadrilateral meshes or
mixed triangle/quadrilateral meshes, and these models may also require
compression. We present an algorithm for encoding the connectivity of such
quadrilateral meshes, and we demonstrate that by preserving and exploiting the
original quad structure, our approach achieves encodings 30 - 80% smaller than
an approach based on randomly splitting quads into triangles. We present both a
code with a proven worst-case cost of 3 bits per vertex (or 2.75 bits per
vertex for meshes without valence-two vertices) and entropy-coding results for
typical meshes ranging from 0.3 to 0.9 bits per vertex, depending on the
regularity of the mesh. Our method may be implemented by a rule for a
particular splitting of quads into triangles and by using the compression and
decompression algorithms introduced in [Rossignac99] and
[Rossignac&Szymczak99]. We also present extensions to the algorithm to compress
meshes with holes and handles and meshes containing triangles and other
polygons as well as quads.
Capacity and Random-Coding Exponents for Channel Coding with Side Information
Capacity formulas and random-coding exponents are derived for a generalized
family of Gel'fand-Pinsker coding problems. These exponents yield asymptotic
upper bounds on the achievable log probability of error. In our model,
information is to be reliably transmitted through a noisy channel with finite
input and output alphabets and random state sequence, and the channel is
selected by a hypothetical adversary. Partial information about the state
sequence is available to the encoder, adversary, and decoder. The design of the
transmitter is subject to a cost constraint. Two families of channels are
considered: 1) compound discrete memoryless channels (CDMC), and 2) channels
with arbitrary memory, subject to an additive cost constraint, or more
generally to a hard constraint on the conditional type of the channel output
given the input. Both problems are closely connected. The random-coding
exponent is achieved using a stacked binning scheme and a maximum penalized
mutual information decoder, which may be thought of as an empirical generalized
Maximum a Posteriori decoder. For channels with arbitrary memory, the
random-coding exponents are larger than their CDMC counterparts. Applications
of this study include watermarking, data hiding, communication in the presence of
partially known interferers, and problems such as broadcast channels, all of
which involve the fundamental idea of binning.
Comment: to appear in IEEE Transactions on Information Theory, without
Appendices G and
Mathematical Programming Decoding of Binary Linear Codes: Theory and Algorithms
Mathematical programming is a branch of applied mathematics and has recently
been used to derive new decoding approaches, challenging established but often
heuristic algorithms based on iterative message passing. Concepts from
mathematical programming used in the context of decoding include linear,
integer, and nonlinear programming, network flows, notions of duality as well
as matroid and polyhedral theory. This survey article reviews and categorizes
decoding methods based on mathematical programming approaches for binary linear
codes over binary-input memoryless symmetric channels.
Comment: 17 pages, submitted to the IEEE Transactions on Information Theory.
Published July 201