10 research outputs found
A Probabilistic Upper Bound on Differential Entropy
A novel, non-trivial, probabilistic upper bound on the entropy of an unknown
one-dimensional distribution, given the support of the distribution and a
sample from that distribution, is presented. No knowledge beyond the support of
the unknown distribution is required, nor is the distribution required to have
a density. Previous distribution-free bounds on the cumulative distribution
function of a random variable given a sample of that variable are used to
construct the bound. A simple, fast, and intuitive algorithm for computing the
entropy bound from a sample is provided.
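The key ingredient named here is a distribution-free bound on the CDF given a sample. As a minimal sketch of that ingredient only (not the paper's full entropy-bound algorithm), the snippet below builds a simultaneous confidence band for the CDF using the Dvoretzky-Kiefer-Wolfowitz inequality; the function name, significance level, and the specific choice of the DKW bound are illustrative assumptions.

```python
import numpy as np

def dkw_cdf_band(sample, alpha=0.05):
    """Simultaneous (1 - alpha) confidence band for the CDF of the sampled
    distribution, via the Dvoretzky-Kiefer-Wolfowitz inequality.
    Returns the order statistics with lower/upper CDF bounds at each point."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    # DKW half-width: sup_x |F_n(x) - F(x)| <= eps with probability >= 1 - alpha
    eps = np.sqrt(np.log(2.0 / alpha) / (2.0 * n))
    # Empirical CDF evaluated at the order statistics
    F_hat = np.arange(1, n + 1) / n
    lower = np.clip(F_hat - eps, 0.0, 1.0)
    upper = np.clip(F_hat + eps, 0.0, 1.0)
    return x, lower, upper

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample = rng.beta(2.0, 5.0, size=200)  # unknown distribution on a known support [0, 1]
    x, lo, hi = dkw_cdf_band(sample, alpha=0.05)
    print(x[:3], lo[:3], hi[:3])
```

Such a band, together with the known support, is the kind of distribution-free CDF information the abstract says the entropy bound is constructed from.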
Guaranteed bounds on the Kullback-Leibler divergence of univariate mixtures using piecewise log-sum-exp inequalities
Information-theoretic measures such as the entropy, the cross-entropy, and the
Kullback-Leibler divergence between two mixture models are core primitives in
many signal processing tasks. Since the Kullback-Leibler divergence of mixtures
provably does not admit a closed-form formula, it is in practice either
estimated using costly Monte-Carlo stochastic integration, approximated, or
bounded using various techniques. We present a fast and generic method that
algorithmically builds closed-form lower and upper bounds on the entropy, the
cross-entropy, and the Kullback-Leibler divergence of mixtures. We illustrate
the versatility of the method by reporting on experiments approximating the
Kullback-Leibler divergence between univariate exponential mixtures, Gaussian
mixtures, Rayleigh mixtures, and Gamma mixtures. Comment: 20 pages, 3 figures.
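For context on the costly Monte-Carlo baseline this abstract contrasts with (not the paper's piecewise log-sum-exp bounds), here is a minimal sketch of a stochastic estimate of KL(p || q) between two univariate Gaussian mixtures; the function names and the example mixture parameters are illustrative assumptions.

```python
import numpy as np
from scipy.special import logsumexp
from scipy.stats import norm

def mixture_logpdf(x, weights, means, stds):
    """Log-density of a univariate Gaussian mixture, evaluated with
    log-sum-exp for numerical stability."""
    x = np.asarray(x, dtype=float)[:, None]
    comp = norm.logpdf(x, loc=means, scale=stds) + np.log(weights)
    return logsumexp(comp, axis=1)

def mc_kl(p, q, n=100_000, seed=0):
    """Monte-Carlo estimate of KL(p || q), where p and q are Gaussian
    mixtures given as (weights, means, stds) triples."""
    rng = np.random.default_rng(seed)
    w, mu, sigma = (np.asarray(a, dtype=float) for a in p)
    # Draw from p: pick a component by weight, then sample that Gaussian
    idx = rng.choice(len(w), size=n, p=w)
    x = rng.normal(mu[idx], sigma[idx])
    return np.mean(mixture_logpdf(x, *p) - mixture_logpdf(x, *q))

if __name__ == "__main__":
    p = ([0.5, 0.5], [-1.0, 2.0], [0.5, 1.0])
    q = ([0.3, 0.7], [0.0, 1.5], [1.0, 1.0])
    print(mc_kl(p, q))
```

The variance of such an estimator shrinks only as 1/n, which is what motivates deterministic closed-form bounds of the kind the paper constructs.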
Determining the Number of Samples Required to Estimate Entropy in Natural Sequences
Calculating the Shannon entropy for symbolic sequences has been widely
considered in many fields. For descriptive statistical problems such as
estimating the N-gram entropy of English language text, a common approach is to
use as much data as possible to obtain progressively more accurate estimates.
However, in some instances, only short sequences may be available. This gives
rise to the question of how many samples are needed to compute entropy. In this
paper, we examine this problem and propose a method for estimating the number
of samples required to compute Shannon entropy for a set of ranked symbolic
natural events. The result is developed using a modified Zipf-Mandelbrot law
and the Dvoretzky-Kiefer-Wolfowitz inequality, and we propose an algorithm
which yields an estimate for the minimum number of samples required to obtain
an estimate of entropy with a given confidence level and degree of accuracy.
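As a sketch of the Dvoretzky-Kiefer-Wolfowitz ingredient named above (not the modified Zipf-Mandelbrot model or the authors' full algorithm), the following snippet computes the smallest sample size for which the empirical distribution is uniformly within a chosen tolerance of the true distribution at a chosen confidence level; the function name and the example tolerance are assumptions.

```python
import math

def dkw_min_samples(epsilon, delta):
    """Smallest n such that, by the Dvoretzky-Kiefer-Wolfowitz inequality,
    the empirical CDF is within `epsilon` of the true CDF (uniformly in x)
    with probability at least 1 - `delta`, i.e. 2*exp(-2*n*epsilon**2) <= delta."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))

# e.g. a 0.01-accurate empirical distribution with 95% confidence
print(dkw_min_samples(epsilon=0.01, delta=0.05))  # -> 18445
```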
A Probabilistic Upper Bound on Differential Entropy
Abstract: The differential entropy is a quantity employed ubiquitously in communications, statistical learning, physics, and many other fields. We present, to our knowledge, the first non-trivial probabilistic upper bound on the entropy of an unknown one-dimensional distribution, given the support of the distribution and a sample from that distribution. The bound is completely general in that it does not depend in any way on the form of the unknown distribution (among other things, it does not require that the distribution have a density). Our bound uses previous distribution-free bounds on the cumulative distribution function of a random variable given a sample of that variable. We provide a simple, fast, and intuitive algorithm for computing the entropy bound from a sample.
A Probabilistic Upper Bound on Differential Entropy
Abstract: A novel, non-trivial, probabilistic upper bound on the entropy of an unknown one-dimensional distribution, given the support of the distribution and a sample from that distribution, is presented. No knowledge beyond the support of the unknown distribution is required. Previous distribution-free bounds on the cumulative distribution function of a random variable given a sample of that variable are used to construct the bound. A simple, fast, and intuitive algorithm for computing the entropy bound from a sample is provided.