Advances in Non-parametric Hypothesis Testing with Kernels
Non-parametric statistical hypothesis testing procedures aim to distinguish the null hypothesis from the alternative with minimal assumptions on the model distributions. In recent years, the maximum mean discrepancy (MMD) has been developed as a measure to compare two distributions, applicable to two-sample problems and independence tests. With the aid of sufficiently rich reproducing kernel Hilbert spaces (RKHS), the MMD enjoys desirable statistical properties, including the characteristic property, consistency, and maximal test power. Moreover, the MMD has seen empirical success in complex tasks such as training and comparing generative models. Stein's method also provides an elegant probabilistic tool for comparing unnormalised distributions, which commonly appear in practical machine learning tasks. Combined with a sufficiently rich RKHS, the kernel Stein discrepancy (KSD) has been developed as a proper discrepancy measure between distributions, which can be used to tackle one-sample problems (or goodness-of-fit tests). The existing development of the KSD applies to a limited choice of domains, such as Euclidean spaces or finite discrete sets, and requires complete data observations, while current MMD constructions are limited to simple kernels, for which test power suffers, e.g. on high-dimensional image data. The main focus of this thesis is the further advancement of kernel-based statistics for hypothesis testing. Firstly, Stein operators compatible with broader data domains are developed to perform the corresponding goodness-of-fit tests: goodness-of-fit tests are constructed for general unnormalised densities on Riemannian manifolds, which have non-Euclidean topology. In addition, novel non-parametric goodness-of-fit tests for data with censoring are studied. Tests for data observations with left truncation are then studied; e.g. the time of entering a hospital always precedes the time of death in the hospital, and we say the death time is truncated by the entry time. We test the notion of independence beyond truncation by proposing a kernelised measure for quasi-independence. Finally, we study deep kernel architectures to improve two-sample testing performance.
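To make the MMD referenced in the abstract above concrete, the following is a minimal sketch of the standard unbiased estimator of MMD² for a two-sample test. The Gaussian (RBF) kernel and its bandwidth are illustrative assumptions here, not the specific deep-kernel constructions the thesis studies:

```python
import numpy as np

def rbf_kernel(X, Y, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and Y."""
    sq = np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :] - 2 * X @ Y.T
    return np.exp(-sq / (2 * bandwidth**2))

def mmd2_unbiased(X, Y, bandwidth=1.0):
    """Unbiased estimate of the squared maximum mean discrepancy
    between samples X (m, d) and Y (n, d)."""
    m, n = len(X), len(Y)
    Kxx = rbf_kernel(X, X, bandwidth)
    Kyy = rbf_kernel(Y, Y, bandwidth)
    Kxy = rbf_kernel(X, Y, bandwidth)
    # diagonal terms are dropped so the within-sample averages are unbiased
    term_x = (Kxx.sum() - np.trace(Kxx)) / (m * (m - 1))
    term_y = (Kyy.sum() - np.trace(Kyy)) / (n * (n - 1))
    return term_x + term_y - 2 * Kxy.mean()
```

In a permutation-based two-sample test, this statistic is recomputed on shuffled pools of the two samples to obtain a null distribution; the estimate concentrates near zero when both samples come from the same distribution and grows when they do not.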
Stein operators, kernels and discrepancies for multivariate continuous distributions
We present a general framework for setting up Stein's method for multivariate continuous distributions. The approach gives a collection of Stein characterizations, among which we highlight score-Stein operators and kernel-Stein operators. Applications include copulas and distances between posterior distributions. We give a general explicit construction of Stein kernels for elliptical distributions and discuss Stein kernels in generality, highlighting connections with Fisher information and mass transport. Finally, a goodness-of-fit test based on Stein discrepancies is given.
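The score-Stein operator highlighted above yields a computable discrepancy that needs only the score ∇log p, so the normalising constant cancels. As a hedged sketch (not the paper's own construction), the following computes a U-statistic estimate of the squared kernel Stein discrepancy using the Langevin Stein operator with an RBF kernel, whose derivatives are available in closed form:

```python
import numpy as np

def ksd2_ustat(X, score, bandwidth=1.0):
    """U-statistic estimate of the squared kernel Stein discrepancy
    between a sample X (n, d) and a target density p, given only the
    score function s(x) = grad log p(x); normalisation is not needed."""
    n, d = X.shape
    h2 = bandwidth**2
    S = score(X)                              # (n, d) scores at the sample points
    diff = X[:, None, :] - X[None, :, :]      # diff[i, j] = x_i - x_j
    sq = np.sum(diff**2, axis=-1)
    K = np.exp(-sq / (2 * h2))                # RBF kernel matrix
    # Stein kernel u_p(x, y) = s(x)^T s(y) k(x, y) + s(x)^T grad_y k
    #                          + s(y)^T grad_x k + trace(grad_x grad_y k)
    term1 = (S @ S.T) * K
    term2 = np.einsum('id,ijd->ij', S, diff) / h2 * K   # s(x_i)^T grad_y k
    term3 = -np.einsum('jd,ijd->ij', S, diff) / h2 * K  # s(x_j)^T grad_x k
    term4 = (d / h2 - sq / h2**2) * K
    U = term1 + term2 + term3 + term4
    # drop the diagonal for an unbiased (U-statistic) estimate
    return (U.sum() - np.trace(U)) / (n * (n - 1))
```

The estimate is near zero when X is drawn from p and strictly positive in expectation otherwise, which is what makes it usable as a goodness-of-fit test statistic.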
A Riemannian-Stein Kernel Method
This paper presents a theoretical analysis of numerical integration based on interpolation with a Stein kernel. In particular, the case of integrals with respect to a posterior distribution supported on a general Riemannian manifold is considered, and the asymptotic convergence of the estimator in this context is established. Our results are considerably stronger than those previously reported, in that the optimal rate of convergence is established under a basic Sobolev-type assumption on the integrand. The theoretical results are empirically verified on
Composite Goodness-of-fit Tests with Kernels
Model misspecification can create significant challenges for the implementation of probabilistic models, and this has led to the development of a range of inference methods which directly account for this issue. However, whether these more involved methods are required will depend on whether the model is really misspecified, and there is a lack of generally applicable methods to answer this question. One set of tools which can help are goodness-of-fit tests, where we test whether a dataset could have been generated by a fixed distribution. Kernel-based tests have been developed for this problem, and these are popular due to their flexibility, strong theoretical guarantees, and ease of implementation in a wide range of scenarios. In this paper, we extend this line of work to the more challenging composite goodness-of-fit problem, where we are instead interested in whether the data comes from any distribution in some parametric family. This is equivalent to testing whether a parametric model is well-specified for the data.
A Kernel Stein Test of Goodness of Fit for Sequential Models
We propose a goodness-of-fit measure for probability densities modeling observations with varying dimensionality, such as text documents of differing lengths or variable-length sequences. The proposed measure is an instance of the kernel Stein discrepancy (KSD), which has been used to construct goodness-of-fit tests for unnormalized densities. The KSD is defined by its Stein operator: current operators used in testing apply to fixed-dimensional spaces. As our main contribution, we extend the KSD to the variable-dimension setting by identifying appropriate Stein operators, and propose a novel KSD goodness-of-fit test. As with the previous variants, the proposed KSD does not require the density to be normalized, allowing the evaluation of a large class of models. Our test is shown to perform well in practice on discrete sequential data benchmarks.
A unified approach to goodness-of-fit testing for spherical and hyperspherical data
We propose a general and relatively simple method for the construction of goodness-of-fit tests on the sphere and the hypersphere. The method is based on the characterization of probability distributions via their characteristic function, and it leads to test criteria that are convenient regarding applications and consistent against arbitrary deviations from the model under test. We emphasize goodness-of-fit tests for spherical distributions due to their importance in applications and the relative scarcity of available methods.
Comment: 29 pages, 2 figures, 6 tables