Maximum Mean Discrepancy Meets Neural Networks: The Radon-Kolmogorov-Smirnov Test

Abstract

Maximum mean discrepancy (MMD) refers to a general class of nonparametric two-sample tests that are based on maximizing the mean difference over samples from one distribution $P$ versus another $Q$, over all choices of data transformations $f$ living in some function space $\mathcal{F}$. Inspired by recent work that connects what are known as functions of \textit{Radon bounded variation} (RBV) and neural networks (Parhi and Nowak, 2021, 2023), we study the MMD defined by taking $\mathcal{F}$ to be the unit ball in the RBV space of a given smoothness order $k \geq 0$. This test, which we refer to as the \textit{Radon-Kolmogorov-Smirnov} (RKS) test, can be viewed as a generalization of the well-known and classical Kolmogorov-Smirnov (KS) test to multiple dimensions and higher orders of smoothness. It is also intimately connected to neural networks: we prove that the witness in the RKS test -- the function $f$ achieving the maximum mean difference -- is always a ridge spline of degree $k$, i.e., a single neuron in a neural network. This allows us to leverage the power of modern deep learning toolkits to (approximately) optimize the criterion that underlies the RKS test. We prove that the RKS test has asymptotically full power at distinguishing any distinct pair $P \neq Q$ of distributions, derive its asymptotic null distribution, and carry out extensive experiments to elucidate the strengths and weaknesses of the RKS test versus the more traditional kernel MMD test.
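
Because the abstract states that the witness is a single ridge-spline neuron, the RKS statistic can be approximated by gradient ascent in a standard deep learning toolkit. The sketch below is illustrative only, not the authors' implementation: it assumes a PyTorch parametrization $f(x) = \max(0, w \cdot x - b)^k$ with the direction $w$ renormalized to unit length at each step, which simplifies the exact unit-ball constraint on the RBV seminorm.

```python
import numpy as np
import torch

def rks_statistic(X, Y, k=1, steps=500, lr=0.05, seed=0):
    """Approximate the RKS test statistic by gradient ascent over a single
    ridge-spline neuron f(x) = max(0, w.x - b)^k with ||w|| = 1.
    Simplified sketch: the unit-ball constraint in the RBV norm is replaced
    by renormalizing the ridge direction w at every iteration."""
    torch.manual_seed(seed)
    X = torch.as_tensor(X, dtype=torch.float32)
    Y = torch.as_tensor(Y, dtype=torch.float32)
    d = X.shape[1]
    w = torch.randn(d, requires_grad=True)
    b = torch.zeros(1, requires_grad=True)
    opt = torch.optim.Adam([w, b], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        u = w / w.norm()                             # unit-norm ridge direction
        f = lambda Z: torch.clamp(Z @ u - b, min=0.0) ** k
        gap = f(X).mean() - f(Y).mean()              # mean difference under f
        (-gap.abs()).backward()                      # ascend |E_P f - E_Q f|
        opt.step()
    with torch.no_grad():
        u = w / w.norm()
        f = lambda Z: torch.clamp(Z @ u - b, min=0.0) ** k
        return float((f(X).mean() - f(Y).mean()).abs())

# Toy usage: two Gaussian samples in 3 dimensions with shifted means.
rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(200, 3))
Y = rng.normal(0.5, 1.0, size=(200, 3))
print(rks_statistic(X, Y, k=1))
```

In practice the test also needs a calibration step (e.g., a permutation procedure or the asymptotic null distribution derived in the paper) to turn this statistic into a p-value; that step is omitted from the sketch.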
