Exact testing with random permutations

Authors: Jelle Goeman, Jesse Hemerik
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 30/11/2017

When permutation methods are used in practice, often a limited number of random permutations are used to decrease the computational burden. However, most theoretical literature assumes that the whole permutation group is used, and methods based on random permutations tend to be seen as approximate. There exists a very limited amount of literature on exact testing with random permutations and only recently a thorough proof of exactness was given. In this paper we provide an alternative proof, viewing the test as a "conditional Monte Carlo test" as it has been called in the literature. We also provide extensions of the result. Importantly, our results can be used to prove properties of various multiple testing procedures based on random permutations.
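As an illustration of the kind of test the abstract above discusses, here is a minimal sketch of a two-sample test that uses a limited number of random permutations. It is not taken from the paper: the function name, the test statistic (absolute difference in means), and the default of 999 permutations are assumptions chosen for the example. The observed labelling is counted alongside the random permutations, which is the usual way to keep the p-value valid when only a random subset of permutations is drawn.

```python
import random


def random_permutation_pvalue(x, y, num_perms=999, seed=0):
    """Two-sample permutation p-value from random permutations (illustrative)."""
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    n_x = len(x)

    def stat(values):
        # Absolute difference between the two group means.
        a, b = values[:n_x], values[n_x:]
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = stat(pooled)
    exceed = 0
    for _ in range(num_perms):
        shuffled = pooled[:]
        rng.shuffle(shuffled)  # one random relabelling of the pooled sample
        if stat(shuffled) >= observed:
            exceed += 1
    # The "+1" terms count the observed (identity) labelling itself.
    return (1 + exceed) / (1 + num_perms)
```

Because the identity permutation is always included, the smallest attainable p-value is 1 / (num_perms + 1).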

arXiv.org e-Print Archive

Crossref

Leiden University Scholarly Publications

The WTO Trade Effect

Authors: Myoung-Jae Lee, Pao-Li Chang

This paper reexamines the GATT/WTO membership effect on bilateral trade flows, using nonparametric methods including pair-matching, permutation tests, and a Rosenbaum (2002) sensitivity analysis. Together, these methods provide an estimation framework that is robust to misspecification biases, allows general forms of heterogeneous treatment effects, and addresses potential hidden selection biases. This is in contrast to most conventional parametric studies on this issue. Our results suggest large GATT/WTO trade-promoting effects, robust to various restricted matching criteria, alternative indicators for GATT/WTO involvement, different matching methodologies, non-random incidence of positive trade flows, and inclusion of multilateral resistance terms.

Keywords: Trade flow, Treatment effect, Matching, Permutation test, Signed-rank test, Sensitivity analysis
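To make the combination of pair-matching, permutation tests, and the signed-rank test concrete, the following sketch computes a signed-rank statistic on matched-pair differences and calibrates it by random sign flips. It is an assumption-laden illustration rather than the authors' procedure: the function names, the one-sided alternative, and the default of 999 sign flips are all hypothetical choices.

```python
import random


def signed_rank_statistic(diffs):
    """Sum of the ranks of |d| over positive differences (zeros dropped)."""
    nonzero = [d for d in diffs if d != 0]
    order = sorted(range(len(nonzero)), key=lambda i: abs(nonzero[i]))
    ranks = [0.0] * len(nonzero)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied absolute values.
        while j + 1 < len(order) and abs(nonzero[order[j + 1]]) == abs(nonzero[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return sum(r for r, d in zip(ranks, nonzero) if d > 0)


def sign_flip_pvalue(diffs, num_flips=999, seed=0):
    """One-sided p-value: flip the sign of each pair difference at random."""
    rng = random.Random(seed)
    observed = signed_rank_statistic(diffs)
    exceed = 0
    for _ in range(num_flips):
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if signed_rank_statistic(flipped) >= observed:
            exceed += 1
    return (1 + exceed) / (1 + num_flips)
```

Here each element of `diffs` would be the outcome of a treated unit minus that of its matched control; under the null of no effect, the sign of each difference is equally likely to be positive or negative.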

Research Papers in Economics

The conditional permutation test for independence while controlling for confounders

Author: Athey
Barber
Belloni
Candès
Cover
Dawid
Doran
Ernst
Fukumizu
Gretton
Hennessy
Kojadinovic
Pfister
Rosenbaum
Runge
Sen
Song
Stigler
Strobl
Su
Su
Su
Székely
Székely
Veraverbeke
Weihs
Zhang
Publication date: 07/05/2019

We propose a general new method, the conditional permutation test, for testing the conditional independence of variables $X$ and $Y$ given a potentially high-dimensional random vector $Z$ that may contain confounding factors. The proposed test permutes entries of $X$ non-uniformly, so as to respect the existing dependence between $X$ and $Z$ and thus account for the presence of these confounders. Like the conditional randomization test of Candès et al. (2018), our test relies on the availability of an approximation to the distribution of $X \mid Z$. While the test of Candès et al. (2018) uses this estimate to draw new $X$ values, for our test we use this approximation to design an appropriate non-uniform distribution on permutations of the $X$ values already seen in the true data. We provide an efficient Markov Chain Monte Carlo sampler for the implementation of our method, and establish bounds on the Type I error in terms of the error in the approximation of the conditional distribution of $X \mid Z$, finding that, for the worst case test statistic, the inflation in Type I error of the conditional permutation test is no larger than that of the conditional randomization test. We validate these theoretical results with experiments on simulated data and on the Capital Bikeshare data set.

Comment: 31 pages, 4 figures
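The idea of permuting the observed X values non-uniformly can be sketched with a simple pairwise-swap Markov chain. This is an illustrative sampler written under stated assumptions, not necessarily the one described in the paper: it targets permutations with probability proportional to the product of q(x_perm(i) | z_i), where `log_density` is a user-supplied (hypothetical) approximation to the log conditional density of X given Z. The Gaussian density below is only an example of such an approximation.

```python
import math
import random


def gaussian_log_density(x_value, z_value):
    # Hypothetical working model: X | Z = z is N(z, 1), up to a constant.
    return -0.5 * (x_value - z_value) ** 2


def sample_conditional_permutation(x, z, log_density, num_sweeps=50, seed=0):
    """Draw a permuted copy of x that respects the dependence of x on z."""
    rng = random.Random(seed)
    n = len(x)
    perm = list(range(n))  # perm[i] is the index of the x value paired with z[i]
    for _ in range(num_sweeps):
        idx = list(range(n))
        rng.shuffle(idx)
        # Visit disjoint random pairs and decide whether to swap each one.
        for k in range(0, n - 1, 2):
            i, j = idx[k], idx[k + 1]
            xi, xj = x[perm[i]], x[perm[j]]
            log_keep = log_density(xi, z[i]) + log_density(xj, z[j])
            log_swap = log_density(xj, z[i]) + log_density(xi, z[j])
            diff = log_keep - log_swap
            # P(swap) = w_swap / (w_swap + w_keep), guarded against overflow.
            p_swap = 0.0 if diff > 700 else 1.0 / (1.0 + math.exp(diff))
            if rng.random() < p_swap:
                perm[i], perm[j] = perm[j], perm[i]
    return [x[p] for p in perm]
```

When `log_density` is constant, every swap is accepted with probability 1/2 and the sampler reduces to ordinary uniform shuffling; the stronger the modelled dependence of X on Z, the closer the sampled permutations stay to the original pairing.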

arXiv.org e-Print Archive

Crossref

Warwick Research Archives Portal Repository

Conditional independence testing based on a nearest-neighbor estimator of conditional mutual information

Author: Jakob Runge
Publication date: 05/09/2017

Conditional independence testing is a fundamental problem underlying causal discovery and a particularly challenging task in the presence of nonlinear and high-dimensional dependencies. Here a fully non-parametric test for continuous data based on conditional mutual information combined with a local permutation scheme is presented. Through a nearest neighbor approach, the test efficiently adapts also to non-smooth distributions due to strongly nonlinear dependencies. Numerical experiments demonstrate that the test reliably simulates the null distribution even for small sample sizes and with high-dimensional conditioning sets. The test is better calibrated than kernel-based tests utilizing an analytical approximation of the null distribution, especially for non-smooth densities, and reaches the same or higher power levels. Combining the local permutation scheme with the kernel tests leads to better calibration, but suffers in power. For smaller sample sizes and lower dimensions, the test is faster than random Fourier feature-based kernel tests if the permutation scheme is (embarrassingly) parallelized, but the runtime increases more sharply with sample size and dimensionality. Thus, more theoretical research to analytically approximate the null distribution and speed up the estimation for larger sample sizes is desirable.

Comment: 17 pages, 12 figures, 1 table
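A local permutation scheme of the kind mentioned in the abstract above can be sketched as follows. This is a simplified illustration, not the paper's algorithm: it assumes a scalar conditioning variable `z`, and the function name, the neighbourhood size `k`, and the fallback rule for exhausted neighbourhoods are hypothetical. Each x value is replaced by the x value of a point whose z is among the k nearest, so the dependence between x and z is approximately preserved while any dependence on a third variable is broken.

```python
import random


def local_permutation(x, z, k=5, seed=0):
    """Shuffle x only among points that are nearest neighbours in z."""
    rng = random.Random(seed)
    n = len(x)
    neighbors = []
    for i in range(n):
        # Indices of the k points closest to z[i]; this includes i itself.
        by_distance = sorted(range(n), key=lambda j: abs(z[j] - z[i]))
        nearest = by_distance[:k]
        rng.shuffle(nearest)
        neighbors.append(nearest)

    used = set()
    result = [None] * n
    order = list(range(n))
    rng.shuffle(order)
    for i in order:
        # Prefer a neighbour whose x value has not been handed out yet.
        choice = next((j for j in neighbors[i] if j not in used), None)
        if choice is None:
            # Every neighbour is taken, so allow one value to be reused.
            choice = rng.choice(neighbors[i])
        used.add(choice)
        result[i] = x[choice]
    return result
```

Repeating this shuffle many times and recomputing the test statistic each time would give a Monte Carlo estimate of the null distribution; the brute-force neighbour search here is quadratic in the sample size and is kept only for clarity.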

arXiv.org e-Print Archive

Institute of Transport Research:Publications

Independence Testing for Multivariate Time Series

Authors: Jaewon Chung, Ronak Mehta, Cencheng Shen, Joshua T. Vogelstein, Ting Xu
Publication date: 14/05/2020

Complex data structures such as time series are increasingly present in modern data science problems. A fundamental question is whether two such time series are statistically dependent. Many current approaches make parametric assumptions on the random processes, only detect linear association, require multiple tests, or forfeit power in high-dimensional, nonlinear settings. Estimating the distribution of any test statistic under the null is non-trivial, as the permutation test is invalid. This work juxtaposes distance correlation (Dcorr) and multiscale graph correlation (MGC) from independence testing literature and block permutation from time series analysis to address these challenges. The proposed nonparametric procedure is valid and consistent, building upon prior work by characterizing the geometry of the relationship, estimating the time lag at which dependence is maximized, avoiding the need for multiple testing, and exhibiting superior power in high-dimensional, low sample size, nonlinear settings. Neural connectivity is analyzed via fMRI data, revealing linear dependence of signals within the visual network and default mode network, and nonlinear relationships in other networks. This work uncovers a first-resort data analysis tool with open-source code available, directly impacting a wide range of scientific disciplines.

Comment: 21 pages, 6 figures
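The block permutation idea referred to in the abstract above can be sketched in a few lines. This is a generic illustration under assumptions, not the authors' released code: the function name and the fixed, non-overlapping block layout are hypothetical. Contiguous blocks of one series are shuffled as units, so the autocorrelation within each block survives while the alignment with the other series is broken.

```python
import random


def block_permute(series, block_size, seed=0):
    """Shuffle contiguous, non-overlapping blocks of a time series."""
    rng = random.Random(seed)
    values = list(series)
    # Cut the series into consecutive blocks; the last one may be shorter.
    blocks = [values[s:s + block_size] for s in range(0, len(values), block_size)]
    rng.shuffle(blocks)
    # Concatenate the blocks back together in their new order.
    return [v for block in blocks for v in block]
```

A null distribution for a dependence statistic between two series could then be approximated by recomputing the statistic on many block-permuted copies of one series; the block size would need to be large relative to the range of autocorrelation.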

arXiv.org e-Print Archive