6,320 research outputs found
A Universal Parallel Two-Pass MDL Context Tree Compression Algorithm
Computing problems that handle large amounts of data necessitate the use of
lossless data compression for efficient storage and transmission. We present a
novel lossless universal data compression algorithm that uses parallel
computational units to increase the throughput. The length- input sequence
is partitioned into blocks. Processing each block independently of the
other blocks can accelerate the computation by a factor of , but degrades
the compression quality. Instead, our approach is to first estimate the minimum
description length (MDL) context tree source underlying the entire input, and
then encode each of the blocks in parallel based on the MDL source. With
this two-pass approach, the compression loss incurred by using more parallel
units is insignificant. Our algorithm is work-efficient, i.e., its
computational complexity is . Its redundancy is approximately
bits above Rissanen's lower bound on universal compression
performance, with respect to any context tree source whose maximal depth is at
most . We improve the compression by using different quantizers for
states of the context tree based on the number of symbols corresponding to
those states. Numerical results from a prototype implementation suggest that
our algorithm offers a better trade-off between compression and throughput than
competing universal data compression algorithms.Comment: Accepted to Journal of Selected Topics in Signal Processing special
issue on Signal Processing for Big Data (expected publication date June
2015). 10 pages double column, 6 figures, and 2 tables. arXiv admin note:
substantial text overlap with arXiv:1405.6322. Version: Mar 2015: Corrected a
typ
Multiresolution vector quantization
Multiresolution source codes are data compression algorithms yielding embedded source descriptions. The decoder of a multiresolution code can build a source reproduction by decoding the embedded bit stream in part or in whole. All decoding procedures start at the beginning of the binary source description and decode some fraction of that string. Decoding a small portion of the binary string gives a low-resolution reproduction; decoding more yields a higher resolution reproduction; and so on. Multiresolution vector quantizers are block multiresolution source codes. This paper introduces algorithms for designing fixed- and variable-rate multiresolution vector quantizers. Experiments on synthetic data demonstrate performance close to the theoretical performance limit. Experiments on natural images demonstrate performance improvements of up to 8 dB over tree-structured vector quantizers. Some of the lessons learned through multiresolution vector quantizer design lend insight into the design of more sophisticated multiresolution codes
Image Characterization and Classification by Physical Complexity
We present a method for estimating the complexity of an image based on
Bennett's concept of logical depth. Bennett identified logical depth as the
appropriate measure of organized complexity, and hence as being better suited
to the evaluation of the complexity of objects in the physical world. Its use
results in a different, and in some sense a finer characterization than is
obtained through the application of the concept of Kolmogorov complexity alone.
We use this measure to classify images by their information content. The method
provides a means for classifying and evaluating the complexity of objects by
way of their visual representations. To the authors' knowledge, the method and
application inspired by the concept of logical depth presented herein are being
proposed and implemented for the first time.Comment: 30 pages, 21 figure
- …