Distance covariance in metric spaces

Author: Lyons Russell
Publication venue: Institute of Mathematical Statistics
Publication date: 03/10/2013

We extend the theory of distance (Brownian) covariance from Euclidean spaces, where it was introduced by Székely, Rizzo and Bakirov, to general metric spaces. We show that for testing independence, it is necessary and sufficient that the metric space be of strong negative type. In particular, we show that this holds for separable Hilbert spaces, which answers a question of Kosorok. Instead of the manipulations of Fourier transforms used in the original work, we use elementary inequalities for metric spaces and embeddings in Hilbert spaces.

Comment: Published at http://dx.doi.org/10.1214/12-AOP803 in the Annals of Probability (http://www.imstat.org/aop/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

Crossref
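
As a rough illustration of the quantity discussed in the entry above: the sample distance covariance needs only pairwise distances, which is why it can be written down for any metric space. The sketch below assumes the standard double-centering form of the statistic. The function names (`pairwise_distances`, `double_center`, `sample_dcov_sq`) and the user-supplied metric callables are hypothetical and are not taken from the paper.

```python
import numpy as np

def pairwise_distances(points, metric):
    """Return the n x n matrix of distances under a user-supplied metric."""
    n = len(points)
    d = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d[i, j] = d[j, i] = metric(points[i], points[j])
    return d

def double_center(d):
    """Subtract row and column means and add back the grand mean."""
    return (d
            - d.mean(axis=0, keepdims=True)
            - d.mean(axis=1, keepdims=True)
            + d.mean())

def sample_dcov_sq(xs, ys, metric_x, metric_y):
    """Squared sample distance covariance of paired observations (xs[i], ys[i])."""
    a = double_center(pairwise_distances(xs, metric_x))
    b = double_center(pairwise_distances(ys, metric_y))
    return float((a * b).mean())

# Usage with the absolute-value metric on the real line (an assumed toy example).
abs_metric = lambda u, v: abs(u - v)
xs = [0.0, 1.0, 2.0, 4.0]
dependent = sample_dcov_sq(xs, xs, abs_metric, abs_metric)
constant = sample_dcov_sq(xs, [5.0] * 4, abs_metric, abs_metric)
```

Here `dependent` is strictly positive, while `constant` is zero because a constant sample has an all-zero distance matrix. Per the abstract, reading a zero population value as independence is justified only when the metric spaces are of strong negative type.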

Fast Two-Sample Testing with Analytic Representations of Probability Measures

Authors: Chwialkowski Kacper; Gretton Arthur; Ramdas Aaditya; Sejdinovic Dino
Publication date: 01/01/2015

We propose a class of nonparametric two-sample tests with a cost linear in the sample size. Two tests are given, both based on an ensemble of distances between analytic functions representing each of the distributions. The first test uses smoothed empirical characteristic functions to represent the distributions; the second uses distribution embeddings in a reproducing kernel Hilbert space. Analyticity implies that differences in the distributions may be detected almost surely at a finite number of randomly chosen locations/frequencies. The new tests are consistent against a larger class of alternatives than the previous linear-time tests based on the (non-smoothed) empirical characteristic functions, while being much faster than the current state-of-the-art quadratic-time kernel-based or energy distance-based tests. Experiments on artificial benchmarks and on challenging real-world testing problems demonstrate that our tests give a better power/time tradeoff than competing approaches, and in some cases, better outright power than even the most expensive quadratic-time tests. This performance advantage is retained even in high dimensions, and in cases where the difference in distributions is not observable with low-order statistics.

arXiv.org e-Print Archive

Oxford University Research Archive
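
To make the second test in the entry above more concrete, here is a minimal sketch of a statistic that compares kernel mean embeddings at a small number of test locations, with cost linear in the sample size. It is an illustration under stated assumptions, not the authors' implementation: it assumes a Gaussian kernel, equal sample sizes, and test locations supplied by the caller. The name `me_statistic`, the `bandwidth` parameter and the small ridge term `reg` are hypothetical choices made for this example.

```python
import numpy as np

def me_statistic(x, y, locations, bandwidth=1.0, reg=1e-8):
    """Normalized difference of kernel embeddings at J test locations.

    x, y      : arrays of shape (n, d) with the same n (assumption)
    locations : array of shape (J, d), e.g. drawn at random
    """
    def gauss(a, t):
        # Gaussian kernel between every sample point and every location.
        sq = ((a[:, None, :] - t[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-sq / (2.0 * bandwidth ** 2))

    z = gauss(x, locations) - gauss(y, locations)   # shape (n, J)
    n, num_locations = z.shape
    z_mean = z.mean(axis=0)
    # Ridge term keeps the covariance invertible in degenerate cases.
    cov = np.atleast_2d(np.cov(z, rowvar=False)) + reg * np.eye(num_locations)
    return float(n * z_mean @ np.linalg.solve(cov, z_mean))

# Usage on synthetic data (assumed toy setup).
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 2))
y_same = rng.normal(size=(200, 2))
y_shift = rng.normal(loc=3.0, size=(200, 2))
locations = rng.normal(size=(3, 2))
stat_same = me_statistic(x, y_same, locations)
stat_shift = me_statistic(x, y_shift, locations)
```

The work is O(nJd) for the kernel evaluations plus O(J^3) for the solve, so it grows linearly with n for a fixed number of locations J. A statistic of this form is typically compared against a chi-squared threshold with J degrees of freedom; that calibration step is left out of the sketch.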

Statistical Methods in Topological Data Analysis for Complex, High-Dimensional Data

Authors: Doerge R. W.; Medina Patrick S.
Publication date: 01/01/2015

The utilization of statistical methods and their applications within the new field of study known as Topological Data Analysis has tremendous potential for broadening our exploration and understanding of complex, high-dimensional data spaces. This paper provides an introductory overview of the mathematical underpinnings of Topological Data Analysis, the workflow to convert samples of data to topological summary statistics, and some of the statistical methods developed for performing inference on these topological summary statistics. The intention of this non-technical overview is to motivate statisticians who are interested in learning more about the subject.

Comment: 15 pages, 7 Figures, 27th Annual Conference on Applied Statistics in Agricultur

arXiv.org e-Print Archive

Kansas State University

Metric Semantics and Full Abstractness for Action Refinement and Probabilistic Choice

Authors: Bakker J.W. de; Hartog J.I. den; Vink E.P. de
Publication venue: Elsevier Science
Publication date: 01/01/2001

This paper provides a case study in the field of metric semantics for probabilistic programming. Both an operational and a denotational semantics are presented for an abstract process language L_pr, which features action refinement and probabilistic choice. The two models are constructed in the setting of complete ultrametric spaces, here based on probability measures of compact support over sequences of actions. It is shown that the standard toolkit for metric semantics works well in the probabilistic context of L_pr, e.g. in establishing the correctness of the denotational semantics with respect to the operational one. In addition, it is shown how the method of proving full abstraction, as proposed recently by the authors for a nondeterministic language with action refinement, can be adapted to deal with the probabilistic language L_pr as well.

CiteSeerX

Elsevier - Publisher Connector

Pure OAI Repository

University of Twente Research Information