Universal Lossless Compression with Unknown Alphabets - The Average Case
Universal compression of patterns of sequences generated by independently
identically distributed (i.i.d.) sources with unknown, possibly large,
alphabets is investigated. A pattern is a sequence of indices that contains all
consecutive indices in increasing order of first occurrence. If the alphabet of
a source that generated a sequence is unknown, the inevitable cost of coding
the unknown alphabet symbols can be exploited to create the pattern of the
sequence. This pattern can in turn be compressed by itself. It is shown that if
the alphabet size k is essentially small, then the average minimax and
maximin redundancies, as well as the redundancy of every code for almost every
source, when compressing a pattern, are at least 0.5 log(n/k^3) bits per
unknown probability parameter, and if all alphabet letters are likely to
occur, there exist codes whose redundancy is at most 0.5 log(n/k^2) bits per
unknown probability parameter, where n is the length of the data
sequences. Otherwise, if the alphabet is large, these redundancies are
essentially at least O(n^{-2/3}) bits per symbol, and there exist codes that
achieve redundancy of essentially O(n^{-1/2}) bits per symbol. Two sub-optimal
low-complexity sequential algorithms for compression of patterns are presented
and their description lengths analyzed, also pointing out that the pattern
average universal description length can decrease below the underlying i.i.d.
entropy for large enough alphabets. Comment: Revised for IEEE Transactions on Information Theory
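The definition of a pattern given in this abstract can be illustrated with a minimal sketch. It is not taken from the paper; the function name `pattern` and the choice of 1-based indices are assumptions made here for illustration. Each symbol is replaced by the index of its first occurrence order, so the result depends only on the repetition structure of the sequence and not on the alphabet itself.

```python
def pattern(sequence):
    """Return the pattern of `sequence` as a list of 1-based indices.

    Hypothetical helper: each distinct symbol gets the next unused index
    the first time it appears, and that index is reused afterwards.
    """
    first_seen = {}  # symbol -> index assigned at its first occurrence
    result = []
    for symbol in sequence:
        if symbol not in first_seen:
            first_seen[symbol] = len(first_seen) + 1
        result.append(first_seen[symbol])
    return result


print(pattern("lossless"))  # [1, 2, 3, 3, 1, 4, 3, 3]
```

Any relabeling of the symbols, for example "xyzzxwzz" in place of "lossless", yields the same pattern, which is why the pattern can be coded separately from the unknown alphabet.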
Compact Binary Relation Representations with Rich Functionality
Binary relations are an important abstraction arising in many data
representation problems. The data structures proposed so far to represent them
support just a few basic operations required to fit one particular application.
We identify many of those operations arising in applications and generalize
them into a wide set of desirable queries for a binary relation representation.
We also identify reductions among those operations. We then introduce several
novel binary relation representations, some simple and some quite
sophisticated, that not only are space-efficient but also efficiently support a
large subset of the desired queries. Comment: 32 pages
Fast algorithms for computing the Boltzmann collision operator
The development of accurate and fast numerical schemes for the fivefold
Boltzmann collision integral represents a challenging problem in scientific
computing. For a particular class of interactions, including the so-called hard
spheres model in dimension three, we are able to derive spectral methods that
can be evaluated through fast algorithms. These algorithms are based on a
suitable representation and approximation of the collision operator. Explicit
expressions for the errors in the schemes are given and spectral accuracy is
proved. Parallelization properties and adaptivity of the algorithms are also
discussed. Comment: 22 pages