Kernel Distribution Embeddings: Universal Kernels, Characteristic Kernels and Kernel Metrics on Distributions
Kernel mean embeddings have recently attracted the attention of the machine
learning community. They map measures $\mu$ from some set $M$ to functions in a
reproducing kernel Hilbert space (RKHS) with kernel $k$. The RKHS distance of
two mapped measures is a semi-metric $d_k$ over $M$. We study three questions.
(I) For a given kernel, what sets $M$ can be embedded? (II) When is the
embedding injective over $M$ (in which case $d_k$ is a metric)? (III) How does
the $d_k$-induced topology compare to other topologies on $M$? The existing
machine learning literature has addressed these questions in cases where $M$ is
(a subset of) the finite regular Borel measures. We unify, improve and
generalise those results. Our approach naturally leads to continuous and
possibly even injective embeddings of (Schwartz-) distributions, i.e.,
generalised measures, but the reader is free to focus on measures only. In
particular, we systemise and extend various (partly known) equivalences between
different notions of universal, characteristic and strictly positive definite
kernels, and show that on an underlying locally compact Hausdorff space,
$d_k$ metrises the weak convergence of probability measures if and only if $k$ is
continuous and characteristic.
Comment: Old and longer version of the JMLR paper with the same title (published
2018). Please start with the JMLR version. 55 pages (33 pages main text, 22
pages appendix), 2 tables, 1 figure (in appendix).
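The RKHS distance between two embedded measures described in this abstract can be illustrated on samples. The following is a minimal sketch, not the paper's method: it assumes a Gaussian kernel on the real line with unit bandwidth and uses the plain (biased) plug-in estimate of the squared distance. The names `gaussian_kernel`, `mean_kernel` and `mmd_squared` are hypothetical helpers introduced for illustration.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel on the real line (an illustrative choice of k)."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth=1.0):
    """Average of k(x, y) over all pairs.

    This equals the RKHS inner product of the two empirical embeddings.
    """
    total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Squared RKHS distance between the empirical embeddings of xs and ys."""
    return (mean_kernel(xs, xs, bandwidth)
            + mean_kernel(ys, ys, bandwidth)
            - 2.0 * mean_kernel(xs, ys, bandwidth))


rng = random.Random(0)
p_sample = [rng.gauss(0.0, 1.0) for _ in range(100)]
p_sample_2 = [rng.gauss(0.0, 1.0) for _ in range(100)]
q_sample = [rng.gauss(2.0, 1.0) for _ in range(100)]

same = mmd_squared(p_sample, p_sample_2)      # both samples from N(0, 1)
different = mmd_squared(p_sample, q_sample)   # N(0, 1) versus N(2, 1)
print(f"same distribution: {same:.4f}, different: {different:.4f}")
```

With a characteristic kernel such as the Gaussian, the population distance is zero only when the two measures coincide, so the second value should be clearly larger than the first.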
A Primer on Reproducing Kernel Hilbert Spaces
Reproducing kernel Hilbert spaces are elucidated without assuming prior
familiarity with Hilbert spaces. Compared with extant pedagogic material,
greater care is placed on motivating the definition of reproducing kernel
Hilbert spaces and explaining when and why these spaces are efficacious. The
novel viewpoint is that reproducing kernel Hilbert space theory studies
extrinsic geometry, associating with each geometric configuration a canonical
overdetermined coordinate system. This coordinate system varies continuously
with changing geometric configurations, making it well-suited for studying
problems whose solutions also vary continuously with changing geometry. This
primer can also serve as an introduction to infinite-dimensional linear algebra
because reproducing kernel Hilbert spaces have more properties in common with
Euclidean spaces than do more general Hilbert spaces.
Comment: Revised version submitted to Foundations and Trends in Signal
Processing
Fast Two-Sample Testing with Analytic Representations of Probability Measures
We propose a class of nonparametric two-sample tests with a cost linear in
the sample size. Two tests are given, both based on an ensemble of distances
between analytic functions representing each of the distributions. The first
test uses smoothed empirical characteristic functions to represent the
distributions, the second uses distribution embeddings in a reproducing kernel
Hilbert space. Analyticity implies that differences in the distributions may be
detected almost surely at a finite number of randomly chosen
locations/frequencies. The new tests are consistent against a larger class of
alternatives than the previous linear-time tests based on the (non-smoothed)
empirical characteristic functions, while being much faster than the current
state-of-the-art quadratic-time kernel-based or energy distance-based tests.
Experiments on artificial benchmarks and on challenging real-world testing
problems demonstrate that our tests give a better power/time tradeoff than
competing approaches, and in some cases, better outright power than even the
most expensive quadratic-time tests. This performance advantage is retained
even in high dimensions, and in cases where the difference in distributions is
not observable with low-order statistics.
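The idea of comparing distribution embeddings at a finite number of random locations can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes one-dimensional data, a Gaussian kernel, equal sample sizes, and a Hotelling-type normalised statistic whose cost is linear in the sample size. The helper names `solve` and `me_statistic`, the regulariser `reg` and the choice of three locations are all assumptions made for the example.

```python
import math
import random


def gaussian_kernel(x, t, bandwidth=1.0):
    return math.exp(-((x - t) ** 2) / (2.0 * bandwidth ** 2))


def solve(a, b):
    """Solve a @ v = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    v = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * v[c] for c in range(r + 1, n))
        v[r] = (m[r][n] - s) / m[r][r]
    return v


def me_statistic(xs, ys, locations, bandwidth=1.0, reg=1e-8):
    """n * W^T Sigma^{-1} W, where W is the mean difference of kernel features.

    Assumes len(xs) == len(ys). Each pair (x_i, y_i) is visited once, so the
    cost is linear in the sample size for a fixed number of locations.
    """
    n, j = len(xs), len(locations)
    z = [[gaussian_kernel(x, t, bandwidth) - gaussian_kernel(y, t, bandwidth)
          for t in locations] for x, y in zip(xs, ys)]
    mean = [sum(row[a] for row in z) / n for a in range(j)]
    cov = [[sum((row[a] - mean[a]) * (row[b] - mean[b]) for row in z) / (n - 1)
            + (reg if a == b else 0.0)
            for b in range(j)] for a in range(j)]
    v = solve(cov, mean)
    return n * sum(m_a * v_a for m_a, v_a in zip(mean, v))


rng = random.Random(1)
locations = [rng.gauss(0.0, 1.0) for _ in range(3)]  # random test locations
xs = [rng.gauss(0.0, 1.0) for _ in range(300)]
ys = [rng.gauss(0.0, 1.0) for _ in range(300)]       # same distribution as xs
zs = [rng.gauss(2.0, 1.0) for _ in range(300)]       # shifted distribution

stat_same = me_statistic(xs, ys, locations)
stat_diff = me_statistic(xs, zs, locations)
print(f"same: {stat_same:.2f}, different: {stat_diff:.2f}")
```

Under the null hypothesis a statistic of this form is approximately chi-squared with as many degrees of freedom as there are locations, so with three locations a value above roughly 7.815 would reject at the 5% level.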
Minimax Estimation of Kernel Mean Embeddings
In this paper, we study the minimax estimation of the Bochner integral
$\mu_k(P) := \int_{\mathcal{X}} k(\cdot, x)\, dP(x)$, also called the kernel
mean embedding, based on random samples drawn i.i.d. from $P$, where
$k : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ is a positive definite
kernel. Various estimators $\hat{\theta}_n$ of $\mu_k(P)$ (including the
empirical estimator) have been studied in the literature, all of which satisfy
$\| \hat{\theta}_n - \mu_k(P) \|_{\mathcal{H}_k} = O_P(n^{-1/2})$, with
$\mathcal{H}_k$ being the reproducing kernel Hilbert space induced by $k$. The
main contribution of the paper is in showing that the above-mentioned rate of
$n^{-1/2}$ is minimax in the $\|\cdot\|_{\mathcal{H}_k}$ and
$\|\cdot\|_{L^2(\mathbb{R}^d)}$ norms over the class of discrete measures and
the class of measures that have an infinitely differentiable density, with
$k$ being a continuous translation-invariant kernel on $\mathbb{R}^d$. The
interesting aspect of this result is that the minimax rate is independent of
the smoothness of the kernel and the density of $P$ (if it exists). This result
has practical consequences in statistical applications as the mean embedding
has been widely employed in non-parametric hypothesis testing, density
estimation, causal inference and feature selection, through its relation to
energy distance (and distance covariance).
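The $n^{-1/2}$ rate of the empirical estimator can be checked numerically in a simple case. For the empirical embedding $\hat{\mu}_n = \frac{1}{n}\sum_i k(\cdot, x_i)$, the expected squared RKHS error equals $\frac{1}{n}\bigl(\mathbb{E}\,k(X,X) - \|\mu_k(P)\|^2_{\mathcal{H}_k}\bigr)$. The sketch below assumes $P = N(0,1)$ on the real line and a Gaussian kernel with unit bandwidth, chosen only because every inner product then has a closed form; it illustrates the upper rate and says nothing about the minimax lower bound proved in the paper. The function names are hypothetical.

```python
import math
import random


def squared_error(xs):
    """Squared RKHS distance between the empirical and the true embedding.

    Assumes P = N(0, 1) and k(x, y) = exp(-(x - y)^2 / 2).
    """
    n = len(xs)
    # <mu_hat, mu_hat>: average of k(x_i, x_j) over all pairs.
    emp_emp = sum(math.exp(-((a - b) ** 2) / 2.0) for a in xs for b in xs) / (n * n)
    # <mu_hat, mu>: E_Y k(x, Y) = exp(-x^2 / 4) / sqrt(2) for Y ~ N(0, 1).
    emp_true = sum(math.exp(-(a * a) / 4.0) / math.sqrt(2.0) for a in xs) / n
    # <mu, mu>: E k(Y, Y') = 1 / sqrt(3) for independent Y, Y' ~ N(0, 1).
    true_true = 1.0 / math.sqrt(3.0)
    return emp_emp - 2.0 * emp_true + true_true


def mean_squared_error(n, trials, rng):
    """Monte Carlo average of the squared error over independent samples of size n."""
    total = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total += squared_error(xs)
    return total / trials


rng = random.Random(0)
mse_small = mean_squared_error(20, 200, rng)
mse_large = mean_squared_error(80, 200, rng)

# E k(X, X) - ||mu||^2 = 1 - 1/sqrt(3); n times the mean squared error
# should be close to this constant for every n.
constant = 1.0 - 1.0 / math.sqrt(3.0)
print(20 * mse_small, 80 * mse_large, constant)
```

Quadrupling the sample size should cut the mean squared error by roughly a factor of four, which corresponds to halving the error in RKHS norm.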