Demystifying Fixed k-Nearest Neighbor Information Estimators
Estimating mutual information from i.i.d. samples drawn from an unknown joint
density function is a basic statistical problem of broad interest with
multitudinous applications. The most popular estimator is one proposed by
Kraskov, Stögbauer, and Grassberger (KSG) in 2004; it is nonparametric and
based on the distances of each sample to its k-th nearest neighboring
sample, where k is a fixed small integer. Despite its widespread use (it is
included in standard scientific software packages), the theoretical
properties of this estimator have
been largely unexplored. In this paper we demonstrate that the estimator is
consistent and also identify an upper bound on the rate of convergence of the
bias as a function of the number of samples. We argue that the superior
performance of the KSG estimator stems from a curious "correlation boosting"
effect and build on this intuition to modify the KSG estimator in novel ways to
construct a superior estimator. As a byproduct of our investigations, we obtain
nearly tight rates of convergence of the error of the well-known fixed
k-nearest neighbor estimator of differential entropy by Kozachenko and
Leonenko.

Comment: 55 pages, 8 figures
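The estimator the abstract describes (KSG "algorithm 1") combines digamma terms: the distance to each sample's k-th nearest neighbor in the joint space sets a radius, and the number of marginal points strictly inside that radius feeds the correction. A minimal sketch, assuming 2-D sample arrays and the max-norm; this illustrates the standard KSG estimator, not the authors' modified version:

```python
import numpy as np
from scipy.special import digamma
from scipy.spatial import cKDTree

def ksg_mi(x, y, k=3):
    """KSG (algorithm 1) estimate of I(X;Y) in nats; x, y are (n, d) arrays."""
    n = len(x)
    xy = np.hstack([x, y])
    # distance from each point to its k-th nearest neighbor in the joint
    # space, under the max-norm (k + 1 because the point itself is returned)
    eps = cKDTree(xy).query(xy, k=k + 1, p=np.inf)[0][:, -1]
    # n_x, n_y: marginal points strictly within that distance (minus self)
    nx = cKDTree(x).query_ball_point(x, eps - 1e-12, p=np.inf,
                                     return_length=True) - 1
    ny = cKDTree(y).query_ball_point(y, eps - 1e-12, p=np.inf,
                                     return_length=True) - 1
    return digamma(k) + digamma(n) - np.mean(digamma(nx + 1) + digamma(ny + 1))

# usage: correlated Gaussians, where the true MI is -0.5*log(1 - rho**2)
rng = np.random.default_rng(0)
rho, n = 0.6, 2000
x = rng.standard_normal((n, 1))
y = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal((n, 1))
mi_hat = ksg_mi(x, y)
```

The strict inequality (radius shrunk by a tiny constant) matches the "strictly within eps" counting of the original KSG paper.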
Conditional Mutual Information Neural Estimator
Several recent works in communication systems have proposed to leverage the
power of neural networks in the design of encoders and decoders. In this
approach, these blocks can be tailored to maximize the transmission rate based
on aggregated samples from the channel. Motivated by the fact that, in many
communication schemes, the achievable transmission rate is determined by a
conditional mutual information term, this paper focuses on neural-based
estimators for this information-theoretic quantity. Our results are based on
variational bounds for the KL-divergence and, in contrast to some previous
works, we provide a mathematically rigorous lower bound. Compared to the
unconditional mutual information, however, additional challenges emerge from
the presence of a conditional density function, which we address here.

Comment: To be presented at ICASSP 202
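The variational route the abstract refers to can be illustrated in the simpler unconditional case: the Donsker-Varadhan bound KL(P‖Q) ≥ E_P[T] − log E_Q[e^T] holds for any critic function T, and taking P as the joint and Q as the product of marginals lower-bounds the mutual information. The Gaussian example and closed-form optimal critic below are illustrative assumptions for this sketch, not the paper's conditional construction:

```python
import numpy as np

def dv_bound(T, joint, product):
    """Donsker-Varadhan lower bound: KL(P||Q) >= E_P[T] - log E_Q[exp(T)]."""
    return T(joint).mean() - np.log(np.exp(T(product)).mean())

rng = np.random.default_rng(0)
rho, n = 0.8, 100_000
# samples from the joint of a correlated Gaussian pair
x = rng.standard_normal(n)
y = rho * x + np.sqrt(1 - rho ** 2) * rng.standard_normal(n)
joint = np.stack([x, y], axis=1)
# samples from the product of marginals: break the pairing by shuffling y
product = np.stack([x, rng.permutation(y)], axis=1)

def critic(s):
    # for Gaussians the optimal critic is the log density ratio
    # log p(x, y) / (p(x) p(y)), known here in closed form
    u, v = s[:, 0], s[:, 1]
    return (-0.5 * np.log(1 - rho ** 2)
            - (u ** 2 - 2 * rho * u * v + v ** 2) / (2 * (1 - rho ** 2))
            + 0.5 * (u ** 2 + v ** 2))

true_mi = -0.5 * np.log(1 - rho ** 2)  # about 0.511 nats
est = dv_bound(critic, joint, product)
```

A neural estimator replaces the closed-form critic with a trained network and maximizes the bound; with the optimal critic, as here, the bound is tight up to sampling error.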