Fast Two-Sample Testing with Analytic Representations of Probability Measures
We propose a class of nonparametric two-sample tests with a cost linear in
the sample size. Two tests are given, both based on an ensemble of distances
between analytic functions representing each of the distributions. The first
test uses smoothed empirical characteristic functions to represent the
distributions, the second uses distribution embeddings in a reproducing kernel
Hilbert space. Analyticity implies that differences in the distributions may be
detected almost surely at a finite number of randomly chosen
locations/frequencies. The new tests are consistent against a larger class of
alternatives than the previous linear-time tests based on the (non-smoothed)
empirical characteristic functions, while being much faster than the current
state-of-the-art quadratic-time kernel-based or energy distance-based tests.
Experiments on artificial benchmarks and on challenging real-world testing
problems demonstrate that our tests give a better power/time tradeoff than
competing approaches, and in some cases, better outright power than even the
most expensive quadratic-time tests. This performance advantage is retained
even in high dimensions, and in cases where the difference in distributions is
not observable with low-order statistics.
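As a rough illustration of the idea in the abstract above (comparing analytic representations of two samples at a few random locations, at linear cost), the following sketch evaluates a kernel mean-embedding difference at a single location and normalises it by its sample variance. It is a simplified, single-location instance and not the authors' implementation: the Gaussian kernel, the bandwidth, the function names and the use of one location instead of several are all assumptions made for illustration. Under these assumptions the statistic is compared against a chi-squared quantile with one degree of freedom (about 3.84 at the 5% level).

```python
import math
import random
import statistics


def gaussian_kernel(x, t, bandwidth=1.0):
    """Gaussian kernel between two points given as equal-length tuples."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, t))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def me_statistic_single_location(xs, ys, location, bandwidth=1.0):
    """Normalised squared difference of kernel embeddings at one location.

    Hypothetical helper: it pairs the samples, so both must have the same
    length n, and it touches each observation once (cost linear in n).
    """
    if len(xs) != len(ys):
        raise ValueError("samples must have the same size")
    n = len(xs)
    # Per-pair difference of the two feature evaluations at the location.
    z = [
        gaussian_kernel(x, location, bandwidth)
        - gaussian_kernel(y, location, bandwidth)
        for x, y in zip(xs, ys)
    ]
    mean_z = statistics.fmean(z)
    var_z = statistics.variance(z)  # unbiased sample variance
    if var_z == 0.0:
        raise ValueError("degenerate sample: zero variance at this location")
    return n * mean_z ** 2 / var_z


if __name__ == "__main__":
    rng = random.Random(0)
    xs = [(rng.gauss(0.0, 1.0),) for _ in range(500)]
    ys = [(rng.gauss(2.0, 1.0),) for _ in range(500)]
    location = (rng.gauss(0.0, 1.0),)  # randomly chosen test location
    stat = me_statistic_single_location(xs, ys, location)
    print("statistic:", stat, "reject at 5%:", stat > 3.84)
```

With several locations the scalar variance would be replaced by a covariance matrix and the reference distribution by a chi-squared with as many degrees of freedom as locations; that extension is omitted here to keep the sketch dependency-free.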
Measured descent: A new embedding method for finite metrics
We devise a new embedding technique, which we call measured descent, based on
decomposing a metric space locally, at varying speeds, according to the density
of some probability measure. This provides a refined and unified framework for
the two primary methods of constructing Fréchet embeddings for finite metrics,
due to [Bourgain, 1985] and [Rao, 1999]. We prove that any n-point metric space
(X,d) embeds in Hilbert space with distortion O(\sqrt{\alpha_X \log n}), where
\alpha_X is a geometric estimate on the decomposability of X. As an immediate
corollary, we obtain an O(\sqrt{(\log \lambda_X) \log n}) distortion embedding,
where \lambda_X is the doubling constant of X. Since \lambda_X \le n, this
result recovers Bourgain's theorem, but when the metric X is, in a sense,
``low-dimensional,'' improved bounds are achieved.
Our embeddings are volume-respecting for subsets of arbitrary size. One
consequence is the existence of (k, O(log n)) volume-respecting embeddings for
all 1 \leq k \leq n, which is the best possible, and answers positively a
question posed by U. Feige. Our techniques are also used to answer positively a
question of Y. Rabinovich, showing that any weighted n-point planar graph
embeds in \ell_\infty^{O(log n)} with O(1) distortion. The O(log n) bound on the
dimension is optimal, and improves upon the previously known bound of O((log
n)^2).
Comment: 17 pages. No figures. Appeared in FOCS '04. To appear in Geometric &
Functional Analysis. This version fixes a subtle error in Section 2.
Distance covariance in metric spaces
We extend the theory of distance (Brownian) covariance from Euclidean spaces,
where it was introduced by Sz\'{e}kely, Rizzo and Bakirov, to general metric
spaces. We show that for testing independence, it is necessary and sufficient
that the metric space be of strong negative type. In particular, we show that
this holds for separable Hilbert spaces, which answers a question of Kosorok.
Instead of the manipulations of Fourier transforms used in the original work,
we use elementary inequalities for metric spaces and embeddings in Hilbert
spaces.
Comment: Published at http://dx.doi.org/10.1214/12-AOP803 in the Annals of
Probability (http://www.imstat.org/aop/) by the Institute of Mathematical
Statistics (http://www.imstat.org
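To make the object in the abstract above concrete, here is a minimal sketch of the sample (V-statistic) squared distance covariance, written so that each sample can live in its own metric space by passing a distance function. The function names and the choice to pass metrics as callables are assumptions made for illustration; the abstract itself concerns the population quantity and the conditions (strong negative type) under which it characterises independence, which this sketch does not check.

```python
def _double_centered(points, dist):
    """Pairwise distance matrix with row, column and grand means removed."""
    n = len(points)
    d = [[dist(p, q) for q in points] for p in points]
    row_mean = [sum(row) / n for row in d]
    grand_mean = sum(row_mean) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [
        [d[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_covariance_sq(xs, ys, dist_x, dist_y):
    """Sample squared distance covariance of paired observations (xs, ys).

    dist_x and dist_y are the metrics on the two spaces (hypothetical
    parameters; any symmetric distance function can be supplied).
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must be paired, hence equally long")
    n = len(xs)
    a = _double_centered(xs, dist_x)
    b = _double_centered(ys, dist_y)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n ** 2


def abs_dist(u, v):
    """Euclidean metric on the real line."""
    return abs(u - v)
```

For example, `distance_covariance_sq([0, 1, 2, 3], [5, 5, 5, 5], abs_dist, abs_dist)` is zero because the second sample is constant, while pairing a sample with itself gives a strictly positive value.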
Kernel Distribution Embeddings: Universal Kernels, Characteristic Kernels and Kernel Metrics on Distributions
Kernel mean embeddings have recently attracted the attention of the machine
learning community. They map measures \mu from some set M to functions in a
reproducing kernel Hilbert space (RKHS) with kernel k. The RKHS distance of
two mapped measures is a semi-metric d_k over M. We study three questions.
(I) For a given kernel, what sets M can be embedded? (II) When is the
embedding injective over M (in which case d_k is a metric)? (III) How does
the d_k-induced topology compare to other topologies on M? The existing
machine learning literature has addressed these questions in cases where M is
(a subset of) the finite regular Borel measures. We unify, improve and
generalise those results. Our approach naturally leads to continuous and
possibly even injective embeddings of (Schwartz-) distributions, i.e.,
generalised measures, but the reader is free to focus on measures only. In
particular, we systemise and extend various (partly known) equivalences between
different notions of universal, characteristic and strictly positive definite
kernels, and show that on an underlying locally compact Hausdorff space, d_k
metrises the weak convergence of probability measures if and only if k is
continuous and characteristic.
Comment: Old and longer version of the JMLR paper with same title (published
2018). Please start with the JMLR version. 55 pages (33 pages main text, 22
pages appendix), 2 tables, 1 figure (in appendix
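The RKHS distance between two embedded measures described in the abstract above can be illustrated on empirical measures. The sketch below computes the squared distance between the kernel mean embeddings of two finite samples, using the biased (V-statistic) estimate. The Gaussian kernel, the bandwidth and the function names are assumptions chosen for illustration and are not taken from the paper.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel between two points given as equal-length tuples."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, kernel=gaussian_kernel):
    """Squared RKHS distance between the empirical embeddings of xs and ys.

    Expands ||mu_x - mu_y||^2 as <mu_x, mu_x> - 2 <mu_x, mu_y> + <mu_y, mu_y>,
    where each inner product is an average of kernel evaluations.
    """
    def mean_kernel(ps, qs):
        total = sum(kernel(p, q) for p in ps for q in qs)
        return total / (len(ps) * len(qs))

    return mean_kernel(xs, xs) - 2.0 * mean_kernel(xs, ys) + mean_kernel(ys, ys)
```

Two identical samples give a value of zero, and two single-point samples at distance 1 give `2 - 2 * math.exp(-0.5)` under the unit-bandwidth Gaussian kernel assumed here.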