Information-Theoretic Limits on Compression of Semantic Information
As conventional communication systems based on classical information theory
have closely approached the limits of Shannon channel capacity, semantic
communication has been recognized as a key enabling technology for the further
improvement of communication performance. However, it remains unsettled how
to represent semantic information and characterise its theoretical limits. In
this paper, we consider a semantic source consisting of a set of correlated
random variables whose joint probability distribution can be described by a
Bayesian network. We give the information-theoretic limit on lossless
compression of the semantic source and introduce a low-complexity encoding
method that exploits the conditional independence. We further characterise the
limits on lossy compression of the semantic source and the corresponding upper
and lower bounds on the rate-distortion function. We also investigate lossy
compression of the semantic source with side information at both the encoder
and decoder, and obtain the rate-distortion function. We prove that the optimal
code for the semantic source is the combination of the optimal codes for each
conditionally independent set given the side information.
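A minimal sketch of this lossless limit, assuming a hypothetical two-node Bayesian network X -> Y with illustrative probabilities (none of the numbers come from the paper): the joint entropy factorizes along the graph as H(X, Y) = H(X) + H(Y|X), which is the quantity a low-complexity encoder can target by coding each variable conditioned on its parents.

```python
import math

# Toy semantic source: a two-node Bayesian network X -> Y over binary
# variables. All probabilities are illustrative, not taken from the paper.
p_x = {0: 0.7, 1: 0.3}                     # P(X)
p_y_given_x = {0: {0: 0.9, 1: 0.1},        # P(Y | X = 0)
               1: {0: 0.2, 1: 0.8}}        # P(Y | X = 1)

def entropy_bits(dist):
    """Shannon entropy in bits of a probability mass function."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# The joint entropy factorizes along the graph: H(X, Y) = H(X) + H(Y|X).
# This is the lossless compression limit in bits per source symbol, and it
# lets each variable be encoded separately given its parents.
h_x = entropy_bits(p_x)
h_y_given_x = sum(p_x[x] * entropy_bits(p_y_given_x[x]) for x in p_x)
print(f"H(X) = {h_x:.4f} bits, H(Y|X) = {h_y_given_x:.4f} bits")
print(f"Lossless limit H(X,Y) = {h_x + h_y_given_x:.4f} bits/symbol")
```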
From Lujan to Laidlaw: A Preliminary Model of Environmental Standing
We consider the problem of code design for compression of correlated sources under adversarial attacks. A scenario with three correlated sources is considered in which at most one source is compromised by an adversary. The theoretical minimum achievable sum-rate for this scenario was derived by Kosut and Tong. We design layered LDPC convolutional codes for this problem, assuming that one of the sources is available at the common decoder as side information. We demonstrate that layered LDPC convolutional codes constitute a sequence of nested codes in which each sub-code is capacity-achieving for the binary symmetric channels used to model the correlation between sources, and can therefore ideally achieve the theoretical minimum sum-rate. Simulated performance results at moderate block lengths show a small gap to the theoretical limit, and the gap vanishes as the block length increases.
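For intuition on the rates involved: when the correlation between a source X and the decoder's side information Y is modeled as a binary symmetric channel (BSC), the non-adversarial Slepian-Wolf limit for compressing X is H(X|Y) = h(p), with p the crossover probability. The sketch below uses an assumed crossover value; Kosut and Tong's adversarial minimum sum-rate is more involved and is not reproduced here.

```python
import math

def binary_entropy(p):
    """Binary entropy function h(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# BSC correlation model: X differs from side information Y with
# probability p. The crossover value below is purely illustrative.
p = 0.05
print(f"Slepian-Wolf limit H(X|Y) = h({p}) = {binary_entropy(p):.4f} bits/bit")
```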
Image gathering and coding for digital restoration: Information efficiency and visual quality
Image gathering and coding are commonly treated as tasks separate from each other and from the digital processing used to restore and enhance the images. The goal is to develop a method that allows us to assess quantitatively the combined performance of image gathering and coding for the digital restoration of images with high visual quality. Digital restoration is often interactive because visual quality depends on perceptual rather than mathematical considerations, and these considerations vary with the target, the application, and the observer. The approach is based on the theoretical treatment of image gathering as a communication channel (J. Opt. Soc. Am. A 2, 1644 (1985); 5, 285 (1988)). Initial results suggest that the practical upper limit of the information contained in the acquired image data ranges typically from approximately 2 to 4 binary information units (bifs) per sample, depending on the design of the image-gathering system. The associated information efficiency of the transmitted data (i.e., the ratio of information to data) ranges typically from approximately 0.3 to 0.5 bif per bit without coding to approximately 0.5 to 0.9 bif per bit with lossless predictive compression and Huffman coding. The visual quality that can be attained with interactive image restoration improves perceptibly as the available information increases to approximately 3 bifs per sample. However, the perceptual improvements that can be attained with further increases in information are very subtle and depend on the target and the desired enhancement.
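A rough sketch of the information-efficiency ratio the abstract reports (bif per bit), assuming a synthetic 8-bit scan line, a first-order predictor, and an assumed information content of 3 bifs per sample; none of these numbers come from the paper, and the residual entropy stands in for the rate an ideal Huffman-style coder would attain.

```python
import collections
import math
import random

# Information efficiency = information carried / bits transmitted.
# We mimic lossless predictive coding on a synthetic 8-bit scan line.
random.seed(0)
line = [128]
for _ in range(9999):
    drift = (128 - line[-1]) // 64          # weak pull toward mid-gray
    line.append(max(0, min(255, line[-1] + drift + random.randint(-8, 8))))

# First-order prediction: transmit differences between adjacent samples.
residuals = [b - a for a, b in zip(line, line[1:])]
counts = collections.Counter(residuals)
n = len(residuals)
rate = -sum(c / n * math.log2(c / n) for c in counts.values())

acquired_info = 3.0                         # assumed bifs/sample (2-4 cited)
print(f"Rate ~ {rate:.2f} bits/sample after prediction (raw: 8)")
print(f"Information efficiency ~ {acquired_info / rate:.2f} bif/bit")
```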
Bounding generalization error with input compression: An empirical study with infinite-width networks
Estimating the Generalization Error (GE) of Deep Neural Networks (DNNs) is an
important task that often relies on the availability of held-out data. The
ability to better predict GE based on a single training set may yield
overarching DNN design principles that reduce reliance on trial and error,
along with other performance-assessment advantages. In search of a quantity
relevant to GE, we
investigate the Mutual Information (MI) between the input and final layer
representations, using the infinite-width DNN limit to bound MI. An existing
input compression-based GE bound is used to link MI and GE. To the best of our
knowledge, this represents the first empirical study of this bound. In our
attempt to empirically falsify the theoretical bound, we find that it is often
tight for best-performing models. Furthermore, it detects randomization of
training labels in many cases, reflects test-time perturbation robustness, and
works well given only a few training samples. These results are promising given
that input compression is broadly applicable where MI can be estimated with
confidence.
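As a concrete illustration of the central quantity, here is a plug-in histogram estimate of the mutual information I(X; T) between inputs and representations on synthetic scalars; the toy tanh "representation" and the binning choices are assumptions for illustration, not the paper's infinite-width construction.

```python
import numpy as np

# Plug-in MI estimate between an input X and a representation T,
# using a 2-D histogram. Data and "network" below are synthetic stand-ins.
rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
t = np.tanh(2.0 * x) + 0.1 * rng.normal(size=x.size)  # toy representation

def mi_binned(a, b, bins=64):
    """Histogram (plug-in) estimate of I(A; B) in bits."""
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    p_ab = joint / joint.sum()
    p_a = p_ab.sum(axis=1, keepdims=True)
    p_b = p_ab.sum(axis=0, keepdims=True)
    mask = p_ab > 0
    return float((p_ab[mask] * np.log2(p_ab[mask] / (p_a @ p_b)[mask])).sum())

print(f"I(X; T) ~ {mi_binned(x, t):.3f} bits")
```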
Statistical mechanics of lossy compression using multilayer perceptrons
Statistical mechanics is applied to lossy compression using multilayer
perceptrons for unbiased Boolean messages. We utilize a tree-like committee
machine (committee tree) and a tree-like parity machine (parity tree) whose
transfer functions are monotonic. For compression using a committee tree, the
lower bound on achievable distortion decreases as the number of hidden units K
increases; however, it cannot reach the Shannon bound even as K -> infinity.
For compression using a parity tree with K >= 2 hidden units, the
rate-distortion function, which is the theoretical limit for lossy compression,
is achieved as the code length tends to infinity.
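The theoretical limit named here is Shannon's rate-distortion function for an unbiased binary source under Hamming distortion, R(D) = 1 - h(D) for 0 <= D <= 1/2, where h is the binary entropy function. A short sketch evaluating it:

```python
import math

def binary_entropy(p):
    """Binary entropy function h(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def rate_distortion_binary(d):
    """R(D) = 1 - h(D) for an unbiased binary source, Hamming distortion."""
    return 1.0 - binary_entropy(d) if d < 0.5 else 0.0

for d in (0.05, 0.1, 0.2):
    print(f"R({d}) = {rate_distortion_binary(d):.4f} bits/symbol")
```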
DRASIC: Distributed Recurrent Autoencoder for Scalable Image Compression
We propose a new architecture for distributed image compression from a group
of distributed data sources. The work is motivated by practical needs of
data-driven codec design, low power consumption, robustness, and data privacy.
The proposed architecture, which we refer to as Distributed Recurrent
Autoencoder for Scalable Image Compression (DRASIC), is able to train
distributed encoders and one joint decoder on correlated data sources. Its
compression capability is much better than that of training the codecs
separately. Meanwhile, the performance of our distributed system with 10
distributed sources is within 2 dB in peak signal-to-noise ratio (PSNR) of
the performance of a single codec trained with all data sources. We experiment
with distributed sources of different correlations and show how well our
data-driven methodology matches the Slepian-Wolf theorem in Distributed Source
Coding (DSC). To the best of our knowledge, this is the first data-driven DSC
framework for general distributed code design with deep learning.
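To make the 2 dB figure concrete, here is a minimal PSNR helper for 8-bit images; the reference and reconstruction arrays are synthetic placeholders, not DRASIC outputs.

```python
import numpy as np

def psnr(reference, reconstruction, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images."""
    diff = reference.astype(np.float64) - reconstruction.astype(np.float64)
    mse = np.mean(diff ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

# Synthetic 8-bit "reference" and a noisy "reconstruction" of it.
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
rec = np.clip(ref + rng.normal(0, 4, size=ref.shape), 0, 255).astype(np.uint8)
print(f"PSNR = {psnr(ref, rec):.2f} dB")
```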