Convergence of Smoothed Empirical Measures with Applications to Entropy Estimation
This paper studies convergence of empirical measures smoothed by a Gaussian
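kernel. Specifically, consider approximating $P \ast \mathcal{N}_\sigma$, for
$\mathcal{N}_\sigma \triangleq \mathcal{N}(0, \sigma^2 \mathrm{I}_d)$, by
$\hat{P}_n \ast \mathcal{N}_\sigma$, where $\hat{P}_n$ is the empirical measure,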
under different statistical distances. The convergence is examined in terms of
the Wasserstein distance, total variation (TV), Kullback-Leibler (KL)
divergence, and $\chi^2$-divergence. We show that the approximation error under
the TV distance and 1-Wasserstein distance ($\mathsf{W}_1$) converges at rate
$e^{O(d)} n^{-\frac{1}{2}}$, in remarkable contrast to a typical
$n^{-\frac{1}{d}}$ rate for unsmoothed $\mathsf{W}_1$ (and $d \ge 3$). For the
KL divergence, squared 2-Wasserstein distance ($\mathsf{W}_2^2$), and
$\chi^2$-divergence, the convergence rate is $e^{O(d)} n^{-1}$, but only if $P$
achieves finite input-output $\chi^2$ mutual information across the additive
white Gaussian noise channel. If the latter condition is not met, the rate
changes to $\omega(n^{-1})$ for the KL divergence and $\mathsf{W}_2^2$, while
the $\chi^2$-divergence becomes infinite - a curious dichotomy. As a main
application we consider estimating the differential entropy
$h(P \ast \mathcal{N}_\sigma)$ in the high-dimensional regime. The distribution
$P$ is unknown but $n$ i.i.d. samples from it are available. We first show that
any good estimator of $h(P \ast \mathcal{N}_\sigma)$ must have sample complexity
that is exponential in $d$. Using the empirical approximation results we then
show that the absolute-error risk of the plug-in estimator converges at the
parametric rate $e^{O(d)} n^{-\frac{1}{2}}$, thus establishing the minimax
rate-optimality of the plug-in. Numerical results that demonstrate a
significant empirical superiority of the plug-in approach to general-purpose
differential entropy estimators are provided.
Comment: arXiv admin note: substantial text overlap with arXiv:1810.1158
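A minimal sketch of the plug-in idea described in this abstract, assuming the
estimator is the differential entropy of the Gaussian mixture centered at the
samples (the smoothed empirical measure), evaluated here by plain Monte Carlo
integration. The function name `plugin_entropy`, its parameters, and the use of
Monte Carlo are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def plugin_entropy(samples, sigma, n_mc=2000, seed=0):
    """Monte Carlo estimate (in nats) of the differential entropy of the
    Gaussian mixture (1/n) * sum_i N(x_i, sigma^2 I_d).

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(samples, dtype=float)
    n, d = x.shape

    # Draw from the mixture: pick a center uniformly, then add Gaussian noise.
    idx = rng.integers(0, n, size=n_mc)
    y = x[idx] + sigma * rng.standard_normal((n_mc, d))

    # Squared distances between every draw and every center, shape (n_mc, n).
    sq = ((y[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1)

    # Log mixture density at each draw, via a numerically stable log-sum-exp.
    a = -sq / (2.0 * sigma**2)
    m = a.max(axis=1, keepdims=True)
    log_q = (
        m[:, 0]
        + np.log(np.exp(a - m).sum(axis=1))
        - np.log(n)
        - 0.5 * d * np.log(2.0 * np.pi * sigma**2)
    )

    # Entropy is the negative expected log-density under the mixture.
    return float(-log_q.mean())


if __name__ == "__main__":
    data = np.random.default_rng(1).standard_normal((200, 3))
    print(plugin_entropy(data, sigma=1.0))
```

As a sanity check, for a single sample point the mixture is one Gaussian, so the
estimate should be close to $\frac{d}{2}\log(2\pi e \sigma^2)$ nats, up to
Monte Carlo error.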
Identifiability and consistent estimation of nonparametric translation hidden Markov models with general state space
This paper considers hidden Markov models where the observations are given as
the sum of a latent state which lies in a general state space and some
independent noise with unknown distribution. It is shown that these fully
nonparametric translation models are identifiable with respect to both the
distribution of the latent variables and the distribution of the noise, mainly
under a light-tail assumption on the latent variables. Two nonparametric
estimation methods are proposed and we prove that the corresponding estimators
are consistent for the weak convergence topology. These results are illustrated
with numerical experiments.