
    Fast Two-Sample Testing with Analytic Representations of Probability Measures

    We propose a class of nonparametric two-sample tests with a cost linear in the sample size. Two tests are given, both based on an ensemble of distances between analytic functions representing each of the distributions. The first test uses smoothed empirical characteristic functions to represent the distributions; the second uses distribution embeddings in a reproducing kernel Hilbert space. Analyticity implies that differences in the distributions may be detected almost surely at a finite number of randomly chosen locations/frequencies. The new tests are consistent against a larger class of alternatives than the previous linear-time tests based on the (non-smoothed) empirical characteristic functions, while being much faster than the current state-of-the-art quadratic-time kernel-based or energy distance-based tests. Experiments on artificial benchmarks and on challenging real-world testing problems demonstrate that our tests give a better power/time tradeoff than competing approaches, and in some cases, better outright power than even the most expensive quadratic-time tests. This performance advantage is retained even in high dimensions, and in cases where the difference in distributions is not observable with low-order statistics.
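    As a rough illustration of the linear-time idea described above, the sketch below evaluates Gaussian-kernel features of each sample at a handful of randomly chosen test locations and forms a Hotelling-type statistic with an asymptotic chi-squared null. This is a minimal sketch under stated assumptions (equal sample sizes, a Gaussian kernel, an illustrative bandwidth parameter gamma, and invented function names), not the authors' implementation.

        import numpy as np
        from scipy.stats import chi2

        def linear_time_two_sample_test(X, Y, T, gamma=1.0):
            # X, Y : (n, d) arrays of paired samples from the two distributions.
            # T    : (J, d) randomly chosen test locations, with J small and fixed.
            def feats(A):
                # Gaussian-kernel evaluations at the J test locations, shape (n, J).
                sq = ((A[:, None, :] - T[None, :, :]) ** 2).sum(-1)
                return np.exp(-gamma * sq)

            Z = feats(X) - feats(Y)                        # per-sample feature differences
            n, J = Z.shape
            zbar = Z.mean(axis=0)
            S = np.cov(Z, rowvar=False) + 1e-8 * np.eye(J) # regularised covariance of the differences
            stat = n * zbar @ np.linalg.solve(S, zbar)     # Hotelling-type statistic
            return stat, chi2.sf(stat, df=J)               # asymptotic chi-squared(J) p-value

    The cost is O(nJd) for n samples, so it scales linearly in the sample size; the frequency-based variant would replace the kernel features with smoothed characteristic-function evaluations at random frequencies.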

    Measured descent: A new embedding method for finite metrics

    We devise a new embedding technique, which we call measured descent, based on decomposing a metric space locally, at varying speeds, according to the density of some probability measure. This provides a refined and unified framework for the two primary methods of constructing Frechet embeddings for finite metrics, due to [Bourgain, 1985] and [Rao, 1999]. We prove that any n-point metric space (X,d) embeds in Hilbert space with distortion O(sqrt{alpha_X log n}), where alpha_X is a geometric estimate on the decomposability of X. As an immediate corollary, we obtain an O(sqrt{(log lambda_X) log n}) distortion embedding, where lambda_X is the doubling constant of X. Since lambda_X <= n, this result recovers Bourgain's theorem, but when the metric X is, in a sense, "low-dimensional," improved bounds are achieved. Our embeddings are volume-respecting for subsets of arbitrary size. One consequence is the existence of (k, O(log n)) volume-respecting embeddings for all 1 <= k <= n, which is the best possible, and answers positively a question posed by U. Feige. Our techniques are also used to answer positively a question of Y. Rabinovich, showing that any weighted n-point planar graph embeds in l_\infty^{O(log n)} with O(1) distortion. The O(log n) bound on the dimension is optimal, and improves upon the previously known bound of O((log n)^2). Comment: 17 pages. No figures. Appeared in FOCS '04. To appear in Geometric & Functional Analysis. This version fixes a subtle error in Section 2.
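    For orientation, a Frechet-type embedding of the kind this paper refines assigns each point its distances to a family of subsets. The LaTeX sketch below is generic: the sets A_1, ..., A_m and the target dimension m are illustrative, whereas measured descent chooses them locally, at varying speeds, according to a probability measure.

        % Generic Frechet embedding: one coordinate per subset.
        \[
          f : X \to \ell_2^m, \qquad
          f(x) = \bigl( d(x, A_1),\, d(x, A_2),\, \dots,\, d(x, A_m) \bigr),
          \qquad d(x, A) := \min_{a \in A} d(x, a).
        \]
        % The distortion of f is the smallest D such that, after rescaling,
        % d(x, y) \le \lVert f(x) - f(y) \rVert_2 \le D \, d(x, y) for all x, y \in X.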

    Distance covariance in metric spaces

    We extend the theory of distance (Brownian) covariance from Euclidean spaces, where it was introduced by Székely, Rizzo and Bakirov, to general metric spaces. We show that for testing independence, it is necessary and sufficient that the metric space be of strong negative type. In particular, we show that this holds for separable Hilbert spaces, which answers a question of Kosorok. Instead of the manipulations of Fourier transforms used in the original work, we use elementary inequalities for metric spaces and embeddings in Hilbert spaces. Comment: Published at http://dx.doi.org/10.1214/12-AOP803 in the Annals of Probability (http://www.imstat.org/aop/) by the Institute of Mathematical Statistics (http://www.imstat.org).
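    The Euclidean statistic that this paper generalises can be computed directly from pairwise distances. Below is a minimal sketch of the sample distance covariance in its V-statistic form, following the Székely-Rizzo-Bakirov definition; the `metric` argument is an illustrative hook for swapping in another metric of strong negative type and is not part of the original formulation.

        import numpy as np
        from scipy.spatial.distance import cdist

        def distance_covariance(X, Y, metric="euclidean"):
            # X: (n, p) observations, Y: (n, q) paired observations.
            a = cdist(X, X, metric=metric)   # pairwise distances within X
            b = cdist(Y, Y, metric=metric)   # pairwise distances within Y
            # Double-centre each distance matrix.
            A = a - a.mean(0, keepdims=True) - a.mean(1, keepdims=True) + a.mean()
            B = b - b.mean(0, keepdims=True) - b.mean(1, keepdims=True) + b.mean()
            # Squared sample distance covariance is the mean elementwise product.
            return np.sqrt(max((A * B).mean(), 0.0))

    A permutation test on this statistic (recomputing it under random re-pairings of the rows of Y) yields the independence test whose validity in general metric spaces is the paper's subject.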

    Kernel Distribution Embeddings: Universal Kernels, Characteristic Kernels and Kernel Metrics on Distributions

    Kernel mean embeddings have recently attracted the attention of the machine learning community. They map measures \mu from some set M to functions in a reproducing kernel Hilbert space (RKHS) with kernel k. The RKHS distance of two mapped measures is a semi-metric d_k over M. We study three questions. (I) For a given kernel, what sets M can be embedded? (II) When is the embedding injective over M (in which case d_k is a metric)? (III) How does the d_k-induced topology compare to other topologies on M? The existing machine learning literature has addressed these questions in cases where M is (a subset of) the finite regular Borel measures. We unify, improve and generalise those results. Our approach naturally leads to continuous and possibly even injective embeddings of (Schwartz-) distributions, i.e., generalised measures, but the reader is free to focus on measures only. In particular, we systemise and extend various (partly known) equivalences between different notions of universal, characteristic and strictly positive definite kernels, and show that on an underlying locally compact Hausdorff space, d_k metrises the weak convergence of probability measures if and only if k is continuous and characteristic. Comment: Old and longer version of the JMLR paper with the same title (published 2018). Please start with the JMLR version. 55 pages (33 pages main text, 22 pages appendix), 2 tables, 1 figure (in appendix).
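    To make the central object concrete, the sketch below computes the empirical RKHS distance d_k between the mean embeddings of two samples (often called the maximum mean discrepancy), using a Gaussian kernel as an illustrative choice of characteristic kernel; the function names and the bandwidth gamma are assumptions for the example, not notation from the paper.

        import numpy as np

        def gaussian_kernel(A, B, gamma=1.0):
            # k(a, b) = exp(-gamma * ||a - b||^2), evaluated for all pairs of rows.
            sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
            return np.exp(-gamma * sq)

        def rkhs_distance(X, Y, gamma=1.0):
            # Biased (V-statistic) estimate of d_k between the empirical measures of X and Y:
            # d_k^2 = E k(x, x') + E k(y, y') - 2 E k(x, y).
            Kxx = gaussian_kernel(X, X, gamma)
            Kyy = gaussian_kernel(Y, Y, gamma)
            Kxy = gaussian_kernel(X, Y, gamma)
            return np.sqrt(max(Kxx.mean() + Kyy.mean() - 2 * Kxy.mean(), 0.0))

    The Gaussian kernel is a standard example of a continuous characteristic kernel, so this d_k is a metric (rather than merely a semi-metric) on probability measures.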