Large-Scale Kernel Methods for Independence Testing

Author: Filippi Sarah
Gretton Arthur
Sejdinovic Dino
Zhang Qinyi
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 25/06/2016

Representations of probability measures in reproducing kernel Hilbert spaces provide a flexible framework for fully nonparametric hypothesis tests of independence, which can capture any type of departure from independence, including nonlinear associations and multivariate interactions. However, these approaches come with an at least quadratic computational cost in the number of observations, which can be prohibitive in many applications. Arguably, it is exactly in such large-scale datasets that capturing any type of dependence is of interest, so striking a favourable tradeoff between computational efficiency and test performance for kernel independence tests would have a direct impact on their applicability in practice. In this contribution, we provide an extensive study of the use of large-scale kernel approximations in the context of independence testing, contrasting block-based, Nyström and random Fourier feature approaches. Through a variety of synthetic data experiments, it is demonstrated that our novel large-scale methods give comparable performance with existing methods whilst using significantly less computation time and memory. Comment: 29 pages, 6 figures
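To make the random Fourier feature idea in the abstract above concrete, here is a minimal sketch of an independence test whose cost grows linearly with the number of observations. It is not the authors' implementation: the function names, the Gaussian kernel, the fixed bandwidth, the feature count and the permutation-based null distribution are all assumptions chosen for illustration.

```python
import numpy as np

def rff_features(x, n_features, bandwidth, rng):
    """Random Fourier features approximating a Gaussian kernel (assumed kernel choice)."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    # Frequencies drawn from the spectral density of the Gaussian kernel.
    w = rng.normal(scale=1.0 / bandwidth, size=(x.shape[1], n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(x @ w + b)

def rff_hsic(zx, zy):
    """Squared Frobenius norm of the empirical cross-covariance of the two feature maps."""
    zx = zx - zx.mean(axis=0)
    zy = zy - zy.mean(axis=0)
    c = zx.T @ zy / zx.shape[0]  # n_features x n_features matrix, O(n * D^2) to form
    return float(np.sum(c ** 2))

def rff_independence_test(x, y, n_features=50, bandwidth=1.0, n_perm=200, seed=0):
    """Return (statistic, p-value); the null is simulated by permuting y (an assumption)."""
    rng = np.random.default_rng(seed)
    zx = rff_features(x, n_features, bandwidth, rng)
    zy = rff_features(y, n_features, bandwidth, rng)
    stat = rff_hsic(zx, zy)
    # Permuting the rows of zy breaks any dependence between the paired samples.
    null = np.array([rff_hsic(zx, zy[rng.permutation(len(zy))]) for _ in range(n_perm)])
    p_value = (1 + np.sum(null >= stat)) / (1 + n_perm)
    return stat, float(p_value)

# Usage: a purely nonlinear dependence with near-zero linear correlation.
rng = np.random.default_rng(1)
x = rng.normal(size=500)
y = x ** 2 + 0.1 * rng.normal(size=500)
stat, p_value = rff_independence_test(x, y)
print(stat, p_value)
```

Because the statistic is computed from a feature cross-covariance matrix rather than from full n-by-n kernel matrices, memory and time scale with the number of features instead of quadratically with the sample size. In practice the bandwidth would typically be set from the data (for example by a median heuristic) rather than fixed at 1.0.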

arXiv.org e-Print Archive

Springer - Publisher Connector

UCL Discovery

Oxford University Research Archive

Spiral - Imperial College Digital Repository

Large-scale kernel methods for independence testing

Author: Arthur Gretton
Dino Sejdinovic
Qinyi Zhang
Sarah Filippi
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Learning new physics efficiently with nonparametric methods

Author: Grosso Gaia
Letizia Marco
Losapio Gianvito
Pierini Maurizio
Rando Marco
Rosasco Lorenzo
Wulzer Andrea
Zanetti Marco
Publication date: 01/01/2022

We present a machine learning approach for model-independent new physics searches. The corresponding algorithm is powered by recent large-scale implementations of kernel methods, nonparametric learning algorithms that can approximate any continuous function given enough data. Based on the original proposal by D'Agnolo and Wulzer (arXiv:1806.02350), the model evaluates the compatibility between experimental data and a reference model, by implementing a hypothesis testing procedure based on the likelihood ratio. Model-independence is enforced by avoiding any prior assumption about the presence or shape of new physics components in the measurements. We show that our approach has dramatic advantages compared to neural network implementations in terms of training times and computational resources, while maintaining comparable performance. In particular, we conduct our tests on higher-dimensional datasets, a step forward with respect to previous studies. Comment: 22 pages, 13 figures

arXiv.org e-Print Archive

DSpace@MIT

EDP Sciences OAI-PMH repository (1.2.0)

PubMed Central

CERN Document Server

Kernel-based Conditional Independence Test and Application in Causal Discovery

Author: Janzing Dominik
Peters Jonas
Schoelkopf Bernhard
Zhang Kun
Publication date: 01/01/2011

Conditional independence testing is an important problem, especially in Bayesian network learning and causal discovery. Due to the curse of dimensionality, testing for conditional independence of continuous variables is particularly challenging. We propose a Kernel-based Conditional Independence test (KCI-test), by constructing an appropriate test statistic and deriving its asymptotic distribution under the null hypothesis of conditional independence. The proposed method is computationally efficient and easy to implement. Experimental results show that it outperforms other methods, especially when the conditioning set is large or the sample size is not very large, in which case other methods encounter difficulties.

arXiv.org e-Print Archive

MPG.PuRe

Conditional independence testing based on a nearest-neighbor estimator of conditional mutual information

Author: Runge Jakob
Publication date: 05/09/2017

Conditional independence testing is a fundamental problem underlying causal discovery and a particularly challenging task in the presence of nonlinear and high-dimensional dependencies. Here a fully non-parametric test for continuous data based on conditional mutual information combined with a local permutation scheme is presented. Through a nearest neighbor approach, the test also adapts efficiently to non-smooth distributions due to strongly nonlinear dependencies. Numerical experiments demonstrate that the test reliably simulates the null distribution even for small sample sizes and with high-dimensional conditioning sets. The test is better calibrated than kernel-based tests utilizing an analytical approximation of the null distribution, especially for non-smooth densities, and reaches the same or higher power levels. Combining the local permutation scheme with the kernel tests leads to better calibration, but suffers in power. For smaller sample sizes and lower dimensions, the test is faster than random Fourier feature-based kernel tests if the permutation scheme is (embarrassingly) parallelized, but the runtime increases more sharply with sample size and dimensionality. Thus, more theoretical research to analytically approximate the null distribution and speed up the estimation for larger sample sizes is desirable. Comment: 17 pages, 12 figures, 1 table

arXiv.org e-Print Archive

Institute of Transport Research:Publications