Estimation of the Rate-Distortion Function
Motivated by questions in lossy data compression and by theoretical
considerations, we examine the problem of estimating the rate-distortion
function of an unknown (not necessarily discrete-valued) source from empirical
data. Our focus is the behavior of the so-called "plug-in" estimator, which is
simply the rate-distortion function of the empirical distribution of the
observed data. Sufficient conditions are given for its consistency, and
examples are provided to demonstrate that in certain cases it fails to converge
to the true rate-distortion function. The analysis of its performance is
complicated by the fact that the rate-distortion function is not continuous in
the source distribution; the underlying mathematical problem is closely related
to the classical problem of establishing the consistency of maximum likelihood
estimators. General consistency results are given for the plug-in estimator
applied to a broad class of sources, including all stationary and ergodic ones.
A more general class of estimation problems is also considered, arising in the
context of lossy data compression when the allowed class of coding
distributions is restricted; analogous results are developed for the plug-in
estimator in that case. Finally, consistency theorems are formulated for
modified (e.g., penalized) versions of the plug-in, and for estimating the
optimal reproduction distribution.
Comment: 18 pages, no figures [v2: removed an example with an error; corrected
typos; a shortened version will appear in IEEE Trans. Inform. Theory]
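As an illustration of the plug-in idea described in this abstract, the sketch below computes one point of the rate-distortion function of the empirical distribution of some observed symbols. It is not code from the paper: it assumes a finite source and reproduction alphabet, uses the classical Blahut-Arimoto iteration at a fixed slope parameter `beta`, and the function names `empirical_distribution` and `blahut_arimoto_rd` are hypothetical.

```python
import math
from collections import Counter


def empirical_distribution(samples):
    """Return the sorted distinct symbols and their relative frequencies."""
    counts = Counter(samples)
    n = len(samples)
    symbols = sorted(counts)
    return symbols, [counts[s] / n for s in symbols]


def blahut_arimoto_rd(p, dist, beta, iters=500):
    """One (distortion, rate-in-bits) point of the R-D curve of the pmf p.

    dist[x][y] is the distortion between source symbol x and reproduction
    symbol y; beta > 0 is the slope parameter (an assumption of this sketch).
    """
    nx, ny = len(dist), len(dist[0])
    q = [1.0 / ny] * ny  # initial reproduction distribution
    for _ in range(iters):
        # Optimal test channel for the current reproduction distribution q.
        Q = []
        for x in range(nx):
            row = [q[y] * math.exp(-beta * dist[x][y]) for y in range(ny)]
            z = sum(row)
            Q.append([v / z for v in row])
        # Reproduction distribution induced by p and that channel.
        q = [sum(p[x] * Q[x][y] for x in range(nx)) for y in range(ny)]
    rate = sum(p[x] * Q[x][y] * math.log2(Q[x][y] / q[y])
               for x in range(nx) for y in range(ny) if Q[x][y] > 0)
    distortion = sum(p[x] * Q[x][y] * dist[x][y]
                     for x in range(nx) for y in range(ny))
    return distortion, rate


# Plug-in estimate from 100 observed bits under Hamming distortion.
symbols, p_hat = empirical_distribution([0, 1] * 50)
D, R = blahut_arimoto_rd(p_hat, [[0, 1], [1, 0]], beta=math.log(9))
```

For this balanced binary sample the estimate agrees with the known curve R(D) = 1 - h(D) of a fair-coin source at D = 0.1. The abstract's warning still applies: for general sources the plug-in value need not converge to the true rate-distortion function.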
Neural Estimation of the Rate-Distortion Function With Applications to Operational Source Coding
A fundamental question in designing lossy data compression schemes is how
well one can do in comparison with the rate-distortion function, which
describes the known theoretical limits of lossy compression. Motivated by the
empirical success of deep neural network (DNN) compressors on large, real-world
data, we investigate methods to estimate the rate-distortion function on such
data, which would allow comparison of DNN compressors with optimality. While
one could use the empirical distribution of the data and apply the
Blahut-Arimoto algorithm, this approach presents several computational
challenges and inaccuracies when the datasets are large and high-dimensional,
such as the case of modern image datasets. Instead, we re-formulate the
rate-distortion objective, and solve the resulting functional optimization
problem using neural networks. We apply the resulting rate-distortion
estimator, called NERD, on popular image datasets, and provide evidence that
NERD can accurately estimate the rate-distortion function. Using our estimate,
we show that the rate-distortion performance achievable by DNN compressors is within
several bits of the rate-distortion function for real-world datasets.
Additionally, NERD provides access to the rate-distortion achieving channel, as
well as samples from its output marginal. Therefore, using recent results in
reverse channel coding, we describe how NERD can be used to construct an
operational one-shot lossy compression scheme with guarantees on the achievable
rate and distortion. Experimental results demonstrate competitive performance
with DNN compressors.
Maximal-Capacity Discrete Memoryless Channel Identification
The problem of identifying the channel with the highest capacity among
several discrete memoryless channels (DMCs) is considered. The problem is cast
as a pure-exploration multi-armed bandit problem, which follows the practical
use of training sequences to sense the communication channel statistics. A
capacity estimator is proposed and tight confidence bounds on the estimator
error are derived. Based on this capacity estimator, a gap-elimination
algorithm termed BestChanID is proposed, which is oblivious to the
capacity-achieving input distribution and is guaranteed to output the DMC with
the largest capacity, with a desired confidence. Furthermore, two additional
algorithms, NaiveChanSel and MedianChanEl, which output with a certain confidence a
DMC with capacity close to the maximal, are introduced. Each of those
algorithms is beneficial in a different regime and can be used as a subroutine
in BestChanID. The sample complexity of all algorithms is analyzed as a
function of the desired confidence parameter, the number of channels, and the
channels' input and output alphabet sizes. The cost of best channel
identification is shown to scale quadratically with the alphabet size, and a
fundamental lower bound for the required number of channel senses to identify
the best channel with a certain confidence is derived.
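To make the notion of estimating capacity from training sequences concrete, here is a minimal sketch of one simple possibility: estimate the transition matrix from observed (input, output) pairs, then compute the capacity of that estimated channel with the Blahut-Arimoto iteration. This is an assumption for illustration and not necessarily the estimator proposed in the paper; `estimate_channel` and `capacity_bits` are hypothetical names, and no confidence bounds are computed.

```python
import math


def estimate_channel(pairs, nx, ny):
    """Empirical transition matrix W[x][y] from (input, output) pairs.

    Assumes every input symbol 0..nx-1 appears in at least one pair.
    """
    counts = [[0] * ny for _ in range(nx)]
    for x, y in pairs:
        counts[x][y] += 1
    return [[c / sum(row) for c in row] for row in counts]


def capacity_bits(W, iters=200):
    """Blahut-Arimoto estimate (in bits) of the capacity of the DMC W."""
    nx, ny = len(W), len(W[0])
    r = [1.0 / nx] * nx  # input distribution, initialised to uniform
    for _ in range(iters):
        # Output distribution induced by the current input distribution.
        q = [sum(r[x] * W[x][y] for x in range(nx)) for y in range(ny)]
        # c[x] = exp of the KL divergence between W[x] and q, in nats.
        c = [math.exp(sum(W[x][y] * math.log(W[x][y] / q[y])
                          for y in range(ny) if W[x][y] > 0))
             for x in range(nx)]
        z = sum(r[x] * c[x] for x in range(nx))
        r = [r[x] * c[x] / z for x in range(nx)]
    # log of z is a lower bound on capacity that tightens as r converges.
    return math.log2(z)


# Twenty channel senses of a binary channel with two observed bit flips.
pairs = [(0, 0)] * 9 + [(0, 1)] + [(1, 1)] * 9 + [(1, 0)]
W_hat = estimate_channel(pairs, nx=2, ny=2)
C_hat = capacity_bits(W_hat)
```

In a selection setting, such an estimate would be computed per channel and compared across channels. How many senses are needed before the comparison can be trusted is exactly the sample-complexity question the abstract addresses.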
Estimating the Rate-Distortion Function by Wasserstein Gradient Descent
In the theory of lossy compression, the rate-distortion (R-D) function
describes how much a data source can be compressed (in bit-rate) at any given
level of fidelity (distortion). Obtaining the R-D function for a given data source
establishes the fundamental performance limit for all compression algorithms.
We propose a new method to estimate the R-D function from the perspective of optimal
transport. Unlike the classic Blahut--Arimoto algorithm which fixes the support
of the reproduction distribution in advance, our Wasserstein gradient descent
algorithm learns the support of the optimal reproduction distribution by moving
particles. We prove its local convergence and analyze the sample complexity of
our R-D estimator based on a connection to entropic optimal transport.
Experimentally, we obtain comparable or tighter bounds than state-of-the-art
neural network methods on low-rate sources while requiring considerably less
tuning and computation effort. We also highlight a connection to
maximum-likelihood deconvolution and introduce a new class of sources that can
be used as test cases with known solutions to the R-D problem.
Comment: Accepted as conference paper at NeurIPS 202
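The idea of learning the support by moving particles can be sketched in a few lines. The example below is a simplified stand-in and not the authors' implementation. It assumes a scalar source, squared-error distortion, equal-weight particles, and a fixed slope parameter `lam`. Under those assumptions it runs plain gradient descent on the particle positions to reduce an empirical version of the functional E_x[-log ∫ exp(-lam * d(x, y)) dν(y)]. The names `rd_objective` and `particle_step` are hypothetical.

```python
import math


def rd_objective(xs, ys, lam):
    """Empirical functional: mean over x of -log(mean over y of exp(-lam*(x-y)^2))."""
    total = 0.0
    for x in xs:
        total -= math.log(
            sum(math.exp(-lam * (x - y) ** 2) for y in ys) / len(ys))
    return total / len(xs)


def particle_step(xs, ys, lam, step):
    """Move every particle one gradient-descent step on rd_objective."""
    grads = [0.0] * len(ys)
    for x in xs:
        k = [math.exp(-lam * (x - y) ** 2) for y in ys]
        z = sum(k)
        for j, y in enumerate(ys):
            # Softmax weight of particle j for this x, times d/dy of lam*(x-y)^2.
            grads[j] += (k[j] / z) * 2.0 * lam * (y - x)
    return [y - step * g / len(xs) for y, g in zip(ys, grads)]


# Source samples clustered near -1 and +1; two particles started near 0.
xs = [-1.0, -0.9, 0.9, 1.0]
ys = [-0.2, 0.1]
before = rd_objective(xs, ys, lam=2.0)
for _ in range(50):
    ys = particle_step(xs, ys, lam=2.0, step=0.1)
after = rd_objective(xs, ys, lam=2.0)
```

In contrast to a Blahut-Arimoto iteration on a fixed grid, the reproduction points themselves are the optimisation variables here, so no support has to be chosen in advance. The step size and particle count are arbitrary choices for this toy case.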
Estimation of the rate-distortion function
Motivated by questions in lossy data compression and by theoretical considerations, the problem of estimating the rate-distortion function of an unknown (not necessarily discrete-valued) source from empirical data is examined. The focus is the behavior of the so-called "plug-in" estimator, which is simply the rate-distortion function of the empirical distribution of the observed data. Sufficient conditions are given for its consistency, and examples are provided demonstrating that in certain cases it fails to converge to the true rate-distortion function. The analysis of its performance is complicated by the fact that the rate-distortion function is not continuous in the source distribution; the underlying mathematical problem is closely related to the classical problem of establishing the consistency of maximum-likelihood estimators (MLEs). General consistency results are given for the plug-in estimator applied to a broad class of sources, including all stationary and ergodic ones. A more general class of estimation problems is also considered, arising in the context of lossy data compression when the allowed class of coding distributions is restricted; analogous results are developed for the plug-in estimator in that case. Finally, consistency theorems are formulated for modified (e.g., penalized) versions of the plug-in, and for estimating the optimal reproduction distribution. © 2008 IEEE