Informational Divergence and Entropy Rate on Rooted Trees with Probabilities
Rooted trees with probabilities are used to analyze properties of a variable-length code. A bound on the difference between the entropy rates of the code and a memoryless source is derived in terms of normalized informational divergence. The bound is used to derive converses for exact random number generation, resolution coding, and distribution matching.
Comment: 5 pages, with proofs and illustrating examples.
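As a concrete illustration of the objects this abstract bounds, here is a minimal Python sketch that computes the normalized informational divergence between the leaf distribution of a rooted tree with probabilities and a memoryless comparison source, and compares the two entropy rates. The leaf codewords, their probabilities, and the source parameter are made-up illustrative values; this is not the paper's derivation.

```python
import math

def entropy(probs):
    """Entropy in bits of a probability vector."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def memoryless_prob(word, p1):
    """Probability of a binary string under an i.i.d. Bernoulli(p1) source."""
    ones = word.count("1")
    return (p1 ** ones) * ((1 - p1) ** (len(word) - ones))

# Leaves of a hypothetical rooted tree with probabilities (made-up values).
leaves = {"0": 0.5, "10": 0.3, "11": 0.2}
p1 = 0.4  # memoryless comparison source with P(1) = 0.4

avg_len = sum(p * len(w) for w, p in leaves.items())        # expected codeword length
divergence = sum(p * math.log2(p / memoryless_prob(w, p1))  # D(leaf dist || memoryless)
                 for w, p in leaves.items())

code_entropy_rate = entropy(leaves.values()) / avg_len
source_entropy_rate = entropy([p1, 1 - p1])

print(f"normalized divergence: {divergence / avg_len:.4f} bits per code symbol")
print(f"entropy-rate gap     : {abs(source_entropy_rate - code_entropy_rate):.4f} bits per code symbol")
```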
Fixed-to-Variable Length Distribution Matching
Fixed-to-variable length (f2v) matchers are used to reversibly transform an
input sequence of independent and uniformly distributed bits into an output
sequence of bits that are (approximately) independent and distributed according
to a target distribution. The degree of approximation is measured by the
informational divergence between the output distribution and the target
distribution. An algorithm is developed that efficiently finds optimal f2v
codes. It is shown that by encoding the input bits blockwise, the informational
divergence per bit approaches zero as the block length approaches infinity. A
relation to data compression by Tunstall coding is established.
Comment: 5 pages, essentially the ISIT 2013 version.
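The connection to Tunstall coding can be illustrated with a small sketch. The Python code below is not the paper's optimal-code algorithm: it grows a Tunstall-like dictionary of output words toward an assumed target Bernoulli(p) distribution, maps k uniform input bits to the 2**k words, and reports the informational divergence per output bit for a few input block lengths (k and p are illustrative choices).

```python
import heapq, math

def build_dictionary(k, p):
    """Grow a complete prefix-free set of 2**k binary words by repeatedly
    extending the word that is most probable under the target Bernoulli(p)
    distribution (a Tunstall-like splitting rule)."""
    def prob(w):
        return math.prod(p if b == "1" else 1 - p for b in w)
    heap = [(-prob(w), w) for w in ("0", "1")]
    heapq.heapify(heap)
    while len(heap) < 2 ** k:
        _, w = heapq.heappop(heap)          # most probable leaf so far
        for child in (w + "0", w + "1"):    # split it into its two children
            heapq.heappush(heap, (-prob(child), child))
    return [w for _, w in heap]

def divergence_per_output_bit(words, p):
    """Informational divergence between the uniform codeword distribution
    (induced by uniform input bits) and the target product distribution,
    normalized by the expected output length."""
    q = 1 / len(words)
    prob = lambda w: math.prod(p if b == "1" else 1 - p for b in w)
    div = sum(q * math.log2(q / prob(w)) for w in words)
    avg_len = sum(q * len(w) for w in words)
    return div / avg_len

for k in (2, 4, 6, 8):
    words = build_dictionary(k, p=0.3)
    print(f"k = {k}: divergence per output bit = {divergence_per_output_bit(words, 0.3):.4f}")
```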
Information Rates and Error Exponents for Probabilistic Amplitude Shaping
Probabilistic Amplitude Shaping (PAS) is a coded-modulation scheme in which
the encoder is a concatenation of a distribution matcher with a systematic
Forward Error Correction (FEC) code. For reduced computational complexity, the decoder can be chosen as a concatenation of a mismatched FEC decoder and a dematcher. This work studies the theoretical limits of PAS. The classical joint source-channel coding (JSCC) setup is modified to include systematic FEC and the mismatched FEC decoder. At each step, error exponents and achievable rates for the corresponding setup are derived.
Comment: Shortened version submitted to the Information Theory Workshop (ITW) 201
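To make the encoder structure described above concrete, here is a minimal structural sketch in Python of a PAS-style concatenation. The "distribution matcher" and the "systematic FEC" are toy stand-ins (random shaped sampling and simple stride-wise parities) chosen only to show the flow amplitude matcher -> systematic FEC -> sign mapping; they do not reflect the codes or limits analyzed in the paper.

```python
import random

AMPLITUDES = [1, 3, 5, 7]             # 8-ASK amplitude levels
TARGET = [0.45, 0.30, 0.17, 0.08]     # illustrative shaping distribution

def toy_distribution_matcher(n):
    """Stand-in for a distribution matcher: n shaped amplitude symbols."""
    return random.choices(AMPLITUDES, weights=TARGET, k=n)

def amplitude_bits(amps):
    """Binary labels of the amplitudes (2 bits per 8-ASK amplitude)."""
    label = {1: "00", 3: "01", 5: "10", 7: "11"}
    return "".join(label[a] for a in amps)

def toy_systematic_fec(data_bits, n_parity):
    """Stand-in for a systematic FEC encoder: the data bits are kept as-is
    and simple stride-wise parities serve as the parity bits."""
    parities = "".join(str(sum(map(int, data_bits[i::n_parity])) % 2)
                       for i in range(n_parity))
    return data_bits, parities

def pas_encode(n_symbols):
    amps = toy_distribution_matcher(n_symbols)            # shaped amplitudes
    data_bits, parity_bits = toy_systematic_fec(amplitude_bits(amps), n_symbols)
    # The systematic (data) bits are already carried by the amplitudes;
    # the parity bits select the signs, so the signs are roughly uniform
    # and the amplitude distribution set by the matcher is preserved.
    signs = [1 if b == "0" else -1 for b in parity_bits]
    return [s * a for s, a in zip(signs, amps)]

random.seed(0)
print(pas_encode(8))
```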
Understanding Individual Neuron Importance Using Information Theory
In this work, we investigate the use of three information-theoretic
quantities -- entropy, mutual information with the class variable, and a class
selectivity measure based on Kullback-Leibler divergence -- to understand and
study the behavior of already trained fully-connected feed-forward neural
networks. We analyze the connection between these information-theoretic
quantities and classification performance on the test set by cumulatively
ablating neurons in networks trained on MNIST, FashionMNIST, and CIFAR-10. Our
results parallel those recently published by Morcos et al., indicating that class selectivity is not a good indicator of classification performance.
However, looking at individual layers separately, both mutual information and
class selectivity are positively correlated with classification performance, at
least for networks with ReLU activation functions. We provide explanations for
this phenomenon and conclude that it is ill-advised to compare the proposed
information-theoretic quantities across layers. Finally, we briefly discuss
future prospects of employing information-theoretic quantities for different
purposes, including neuron pruning and studying the effect that different
regularizers and architectures have on the trained neural network. We also draw
connections to the information bottleneck theory of neural networks.
Comment: 30 pages.
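As an illustration of how such per-neuron quantities can be estimated from recorded activations, here is a minimal Python sketch using simple equal-width binning. The random "activations", the binning estimator, and the particular KL-based selectivity score (the maximum over classes of KL(P(T | Y=c) || P(T))) are assumptions made for illustration, not the paper's exact definitions or estimators.

```python
import numpy as np

def neuron_scores(acts, labels, bins=10):
    """acts: (n_samples, n_neurons) activations; labels: (n_samples,) class ids.
    Returns per-neuron entropy, mutual information with the class variable,
    and an assumed KL-based class-selectivity score."""
    n, m = acts.shape
    classes = np.unique(labels)
    entropy = np.zeros(m)
    mi = np.zeros(m)
    selectivity = np.zeros(m)
    for j in range(m):
        # Quantize neuron j's activation into equal-width bins.
        edges = np.histogram_bin_edges(acts[:, j], bins=bins)[1:-1]
        q = np.digitize(acts[:, j], edges)
        p_t = np.bincount(q, minlength=bins) / n                      # P(T)
        entropy[j] = -np.sum(p_t[p_t > 0] * np.log2(p_t[p_t > 0]))    # H(T)
        cond_entropy = 0.0
        per_class_kl = []
        for c in classes:
            mask = labels == c
            p_c = mask.mean()
            p_tc = np.bincount(q[mask], minlength=bins) / mask.sum()  # P(T | Y=c)
            nz = p_tc > 0
            cond_entropy -= p_c * np.sum(p_tc[nz] * np.log2(p_tc[nz]))
            per_class_kl.append(np.sum(p_tc[nz] * np.log2(p_tc[nz] / p_t[nz])))
        mi[j] = entropy[j] - cond_entropy                             # I(T; Y)
        selectivity[j] = max(per_class_kl)                            # assumed selectivity
    return entropy, mi, selectivity

# Illustrative random data in place of real hidden-layer activations.
rng = np.random.default_rng(0)
acts = rng.normal(size=(1000, 32))
labels = rng.integers(0, 10, size=1000)
H, I, S = neuron_scores(acts, labels)
# Cumulative ablation would zero out neurons in this order and re-evaluate test accuracy.
print("least informative neurons first:", np.argsort(I)[:5])
```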