Universal Lossless Compression with Unknown Alphabets - The Average Case
Universal compression of patterns of sequences generated by independently
identically distributed (i.i.d.) sources with unknown, possibly large,
alphabets is investigated. A pattern is a sequence of indices that contains all
consecutive indices in increasing order of first occurrence. If the alphabet of
a source that generated a sequence is unknown, the inevitable cost of coding
the unknown alphabet symbols can be exploited to create the pattern of the
sequence. This pattern can in turn be compressed by itself. It is shown that if
the alphabet size k is essentially small, then the average minimax and
maximin redundancies as well as the redundancy of every code for almost every
source, when compressing a pattern, consist of at least 0.5 log(n/k^3) bits per
each unknown probability parameter, and if all alphabet letters are likely to
occur, there exist codes whose redundancy is at most 0.5 log(n/k^2) bits per
each unknown probability parameter, where n is the length of the data
sequences. Otherwise, if the alphabet is large, these redundancies are
essentially at least O(n^{-2/3}) bits per symbol, and there exist codes that
achieve redundancy of essentially O(n^{-1/2}) bits per symbol. Two sub-optimal
low-complexity sequential algorithms for compression of patterns are presented
and their description lengths analyzed, also pointing out that the pattern
average universal description length can decrease below the underlying i.i.d.
entropy for large enough alphabets.
Comment: Revised for IEEE Transactions on Information Theory
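To make the definition of a pattern in the abstract above concrete, here is a minimal sketch in Python. The function name `pattern` and the choice of 1-based indices are assumptions made for illustration; the abstract only states that each symbol is replaced by an index assigned in order of first occurrence.

```python
def pattern(sequence):
    """Return the pattern of a sequence.

    Each symbol is replaced by the (1-based) rank of its first occurrence,
    so the result depends only on the repetition structure of the sequence,
    not on the actual alphabet letters.
    """
    first_seen = {}  # symbol -> index assigned at its first occurrence
    indices = []
    for symbol in sequence:
        if symbol not in first_seen:
            # A new symbol receives the next unused consecutive index.
            first_seen[symbol] = len(first_seen) + 1
        indices.append(first_seen[symbol])
    return indices


if __name__ == "__main__":
    print(pattern("abracadabra"))  # [1, 2, 3, 1, 4, 1, 5, 1, 2, 3, 1]
```

Two sequences over entirely different alphabets share a pattern whenever their symbols repeat in the same positions, which is why the pattern can be compressed separately from the alphabet itself.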
Barrier Frank-Wolfe for Marginal Inference
We introduce a globally-convergent algorithm for optimizing the
tree-reweighted (TRW) variational objective over the marginal polytope. The
algorithm is based on the conditional gradient method (Frank-Wolfe) and moves
pseudomarginals within the marginal polytope through repeated maximum a
posteriori (MAP) calls. This modular structure enables us to leverage black-box
MAP solvers (both exact and approximate) for variational inference, and obtains
more accurate results than tree-reweighted algorithms that optimize over the
local consistency relaxation. Theoretically, we bound the sub-optimality for
the proposed algorithm despite the TRW objective having unbounded gradients at
the boundary of the marginal polytope. Empirically, we demonstrate the
increased quality of results found by tightening the relaxation over the
marginal polytope as well as the spanning tree polytope on synthetic and
real-world instances.
Comment: 25 pages, 12 figures, To appear in Neural Information Processing
Systems (NIPS) 2015, Corrected reference and cleaned up bibliography
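The conditional gradient (Frank-Wolfe) scheme that the abstract above builds on can be sketched generically as follows. This is not the paper's barrier variant: as stated assumptions, the TRW objective is replaced by a toy quadratic, the marginal polytope by the probability simplex, and the MAP call by a simple linear minimization oracle that returns a vertex. The names `frank_wolfe`, `simplex_lmo`, and `target` are hypothetical.

```python
def frank_wolfe(grad, lmo, x0, num_iters=2000):
    """Generic Frank-Wolfe loop over a polytope.

    grad: function returning the gradient of the objective at x.
    lmo:  linear minimization oracle; given a gradient g, returns a vertex s
          of the polytope minimizing <g, s>. In the marginal inference
          setting this role would be played by a MAP solver (assumption).
    """
    x = list(x0)
    for t in range(num_iters):
        g = grad(x)
        s = lmo(g)
        gamma = 2.0 / (t + 2.0)  # standard diminishing step size
        # Convex combination keeps the iterate inside the polytope.
        x = [(1.0 - gamma) * xi + gamma * si for xi, si in zip(x, s)]
    return x


def simplex_lmo(g):
    """Vertex of the probability simplex minimizing the linear function <g, s>."""
    best = min(range(len(g)), key=lambda j: g[j])
    return [1.0 if j == best else 0.0 for j in range(len(g))]


# Toy objective: squared Euclidean distance to a point inside the simplex.
target = [0.2, 0.5, 0.3]


def toy_grad(x):
    return [2.0 * (xi - ti) for xi, ti in zip(x, target)]


if __name__ == "__main__":
    print(frank_wolfe(toy_grad, simplex_lmo, [1.0, 0.0, 0.0]))
```

Because every iterate is a convex combination of oracle outputs, feasibility is maintained without any projection step; only the oracle needs to know the polytope's structure.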