A Kernel Test for Three-Variable Interactions
We introduce kernel nonparametric tests for Lancaster three-variable
interaction and for total independence, using embeddings of signed measures
into a reproducing kernel Hilbert space. The resulting test statistics are
straightforward to compute, and are used in powerful interaction tests, which
are consistent against all alternatives for a large family of reproducing
kernels. We show the Lancaster test to be sensitive to cases where two
independent causes individually have weak influence on a third dependent
variable, but their combined effect has a strong influence. This makes the
Lancaster test especially suited to finding structure in directed graphical
models, where it outperforms competing nonparametric tests in detecting such
V-structures.
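As a rough illustration of the kind of statistic this abstract describes, the sketch below computes an empirical Lancaster interaction statistic as the scaled sum of the entrywise product of three centered Gram matrices. The Gaussian kernel, the fixed bandwidth, the function names, and the choice to build a null distribution by permuting only `z` (which targets the factorisation in which `z` is independent of the pair `(x, y)`) are all assumptions made for this example, not details taken from the abstract.

```python
import numpy as np


def gaussian_gram(values, bandwidth=1.0):
    """Gaussian-kernel Gram matrix for a sample of shape (n,) or (n, d)."""
    v = np.asarray(values, dtype=float)
    v = v.reshape(len(v), -1)
    sq_dists = np.sum((v[:, None, :] - v[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def center(gram):
    """Double-center a Gram matrix: H K H with H = I - (1/n) 11^T."""
    n = gram.shape[0]
    h = np.eye(n) - np.full((n, n), 1.0 / n)
    return h @ gram @ h


def lancaster_statistic(k_c, l_c, m_c):
    """(1/n^2) times the sum of the entrywise product of centered Gram matrices."""
    n = k_c.shape[0]
    return float(np.sum(k_c * l_c * m_c)) / n ** 2


def lancaster_permutation_test(x, y, z, n_perm=200, bandwidth=1.0, seed=0):
    """Return (statistic, permutation p-value); the null is built by permuting z."""
    rng = np.random.default_rng(seed)
    k_c = center(gaussian_gram(x, bandwidth))
    l_c = center(gaussian_gram(y, bandwidth))
    m_c = center(gaussian_gram(z, bandwidth))
    observed = lancaster_statistic(k_c, l_c, m_c)
    n = k_c.shape[0]
    exceed = 0
    for _ in range(n_perm):
        p = rng.permutation(n)
        # Permuting z permutes rows and columns of its centered Gram matrix.
        if lancaster_statistic(k_c, l_c, m_c[np.ix_(p, p)]) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy V-structure (an assumption for illustration): x and y are independent,
    # and each is pairwise independent of z, but jointly they fix the sign of z.
    rng = np.random.default_rng(1)
    n = 200
    x = rng.standard_normal(n)
    y = rng.standard_normal(n)
    z = np.sign(x * y) * np.abs(rng.standard_normal(n))
    print(lancaster_permutation_test(x, y, z))
```

In this toy data every pair of variables is independent, so a pairwise test has nothing to detect, while the three-variable statistic can still pick up the joint effect.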
Nonpar MANOVA via Independence Testing
The k-sample testing problem tests whether or not k groups of data points
are sampled from the same distribution. Multivariate analysis of variance
(MANOVA) is currently the gold standard for k-sample testing but makes
strong, often inappropriate, parametric assumptions. Moreover, independence
testing and k-sample testing are tightly related, and there are many
nonparametric multivariate independence tests with strong theoretical and
empirical properties, including distance correlation (Dcorr) and
Hilbert-Schmidt-Independence-Criterion (Hsic). We prove that universally
consistent independence tests achieve universally consistent k-sample testing
and that k-sample statistics like Energy and Maximum Mean Discrepancy (MMD)
are exactly equivalent to Dcorr. Empirically evaluating these tests for
k-sample scenarios demonstrates that these nonparametric independence tests
typically outperform MANOVA, even for Gaussian distributed settings. Finally,
we extend these nonparametric k-sample testing procedures to perform
multiway and multilevel tests. Thus, we illustrate the existence of many
theoretically motivated and empirically performant k-sample tests. A Python
package called hyppo, with all independence and k-sample tests, is available from
https://hyppo.neurodata.io/.
Comment: 15 pages main + 4 pages appendix, 9 figures
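The reduction from k-sample testing to independence testing mentioned above can be sketched as follows: stack the groups, attach a one-hot group label to each point, and test whether the data are independent of the labels. This sketch uses plain NumPy with a biased squared distance correlation and a permutation null; it does not use the hyppo package, and the function names and defaults are assumptions made for illustration.

```python
import numpy as np


def pairwise_distances(values):
    """Euclidean distance matrix for a sample of shape (n,) or (n, d)."""
    v = np.asarray(values, dtype=float)
    v = v.reshape(len(v), -1)
    return np.sqrt(np.sum((v[:, None, :] - v[None, :, :]) ** 2, axis=-1))


def double_center(dist):
    """Subtract row and column means and add back the grand mean."""
    return (dist - dist.mean(axis=0, keepdims=True)
            - dist.mean(axis=1, keepdims=True) + dist.mean())


def dcorr_squared(a_c, b_c):
    """Biased squared distance correlation from two centered distance matrices."""
    denom = np.sqrt(np.mean(a_c * a_c) * np.mean(b_c * b_c))
    return float(np.mean(a_c * b_c) / denom) if denom > 0 else 0.0


def ksample_dcorr_test(samples, n_perm=500, seed=0):
    """k-sample test as an independence test between data and group labels.

    `samples` is a list of arrays with the same number of columns.
    Returns (statistic, permutation p-value).
    """
    rng = np.random.default_rng(seed)
    data = np.vstack([np.asarray(s, dtype=float).reshape(len(s), -1)
                      for s in samples])
    labels = np.concatenate([np.full(len(s), i) for i, s in enumerate(samples)])
    one_hot = np.eye(len(samples))[labels]  # one row per point, one column per group

    a_c = double_center(pairwise_distances(data))
    b_c = double_center(pairwise_distances(one_hot))
    observed = dcorr_squared(a_c, b_c)

    n = len(labels)
    exceed = 0
    for _ in range(n_perm):
        p = rng.permutation(n)
        # Shuffling the labels breaks any link between data and group membership.
        if dcorr_squared(a_c, b_c[np.ix_(p, p)]) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    group_a = rng.standard_normal((60, 2))
    group_b = rng.standard_normal((60, 2)) + 1.0  # shifted mean
    print(ksample_dcorr_test([group_a, group_b], n_perm=200))
```

Because the labels are encoded one-hot, the same function handles any number of groups without modification.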
Testing Distributional Inequalities and Asymptotic Bias
When Barrett and Donald (2003) in Econometrica proposed a consistent test of stochastic dominance, they were silent about the asymptotic unbiasedness of their tests against √n-converging Pitman local alternatives. This paper shows that when we focus on first-order stochastic dominance, there exists a wide class of √n-converging Pitman local alternatives against which their test is asymptotically biased, that is, its local asymptotic power is strictly below the asymptotic size. This phenomenon applies more generally to one-sided nonparametric tests whose limit under √n-converging Pitman local alternatives is the sup norm of a shifted standard Brownian bridge. Other examples include tests of independence or conditional independence. We provide an intuitive explanation of this phenomenon and illustrate its implications using simulation studies.
Keywords: Asymptotic Bias, One-sided Tests, Stochastic Dominance, Conditional Independence, Pitman Local Alternatives, Brownian Bridge Processes
On Azadkia-Chatterjee's conditional dependence coefficient
In recent work, Azadkia and Chatterjee (2021) laid out an ingenious approach
to defining consistent measures of conditional dependence. Their fully
nonparametric approach forms statistics based on ranks and nearest neighbor
graphs. The appealing nonparametric consistency of the resulting conditional
dependence measure and the associated empirical conditional dependence
coefficient has quickly prompted follow-up work that seeks to study its
statistical efficiency. In this paper, we take up the framework of conditional
randomization tests (CRT) for conditional independence and conduct a power
analysis that considers two types of local alternatives, namely, parametric
quadratic mean differentiable alternatives and nonparametric Hölder smooth
alternatives. Our local power analysis shows that conditional independence
tests using the Azadkia-Chatterjee coefficient remain inefficient even when
aided with the CRT framework, and serves as motivation to develop variants of
the approach; cf. Lin and Han (2022b). As a byproduct, we resolve a conjecture
of Azadkia and Chatterjee by proving central limit theorems for the considered
conditional dependence coefficients, with explicit formulas for the asymptotic
variances.
Comment: to appear in Bernoulli
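To make "statistics based on ranks and nearest neighbor graphs" concrete, here is a minimal sketch of the unconditional form of the coefficient (no conditioning variables), written from the formula as commonly stated for that special case: the sum over i of n·min(R_i, R_M(i)) − L_i², divided by the sum of L_i·(n − L_i), where R_i counts observations with y_j ≤ y_i, L_i counts observations with y_j ≥ y_i, and M(i) is the nearest neighbor of z_i. The function name, the brute-force neighbor search, and the deterministic tie-breaking are assumptions for illustration; the conditional version and the CRT analysed in the abstract are not implemented here.

```python
import numpy as np


def ac_coefficient(y, z):
    """Rank and nearest-neighbor dependence coefficient of y on z (unconditional case).

    Assumes continuous data, so ties in y or in neighbor distances are not expected.
    """
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=float).reshape(len(y), -1)
    n = len(y)

    # R_i = #{j : y_j <= y_i},  L_i = #{j : y_j >= y_i}
    r = np.sum(y[None, :] <= y[:, None], axis=1)
    l = np.sum(y[None, :] >= y[:, None], axis=1)

    # M(i): index of the nearest neighbor of z_i among the other points.
    sq_dists = np.sum((z[:, None, :] - z[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(sq_dists, np.inf)
    nearest = np.argmin(sq_dists, axis=1)

    numerator = np.sum(n * np.minimum(r, r[nearest]) - l ** 2)
    denominator = np.sum(l * (n - l))
    return float(numerator) / float(denominator)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = rng.uniform(-1.0, 1.0, 400)
    print("y = z^2      :", ac_coefficient(z ** 2, z))
    print("independent y:", ac_coefficient(rng.standard_normal(400), z))
```

The coefficient should be close to 1 when y is a noiseless function of z, even a non-monotone one, and close to 0 when y and z are independent.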
On nonparametric and semiparametric testing for multivariate linear time series
We formulate nonparametric and semiparametric hypothesis testing of
multivariate stationary linear time series in a unified fashion and propose new
test statistics based on estimators of the spectral density matrix. The
limiting distributions of these test statistics under the null hypotheses are
always normal, and the tests are easy to implement in practice. If the null
hypothesis is false, the test statistics diverge to infinity as the sample
size grows, so the tests are consistent against any alternative.
The approach can be applied to various null hypotheses such as the independence
between the component series, the equality of the autocovariance functions or
the autocorrelation functions of the component series, the separability of the
covariance matrix function and the time reversibility. Furthermore, a null
hypothesis with a nonlinear constraint like the conditional independence
between the two series can be tested in the same way.
Comment: Published at http://dx.doi.org/10.1214/08-AOS610 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
A Double Regression Method for Graphical Modeling of High-dimensional Nonlinear and Non-Gaussian Data
Graphical models have long been studied in statistics as a tool for inferring
conditional independence relationships among a large set of random variables.
Most existing work in graphical modeling focuses on cases where the data
are Gaussian or mixed and the variables are linearly dependent. In this paper,
we propose a double regression method for learning graphical models under the
high-dimensional nonlinear and non-Gaussian setting, and prove that the
proposed method is consistent under mild conditions. The proposed method works
by performing a series of nonparametric conditional independence tests. The
conditioning set of each test is reduced via a double regression procedure
where a model-free sure independence screening procedure or a sparse deep
neural network can be employed. The numerical results indicate that the
proposed method works well for high-dimensional nonlinear and non-Gaussian
data.
Comment: 1 figure
A consistent nonparametric bootstrap test of exogeneity
This paper proposes a novel way of testing exogeneity of an explanatory variable without any parametric assumptions in the presence of a "conditional" instrumental variable. We derive a testable implication: if an explanatory variable is endogenous, the conditional distribution of the outcome given the endogenous variable is not independent of its instrumental variable(s). The test rejects the null hypothesis with probability one if the explanatory variable is endogenous, and it detects alternatives converging to the null at a rate n^{-1/2}. We propose a consistent nonparametric bootstrap test to implement this testable implication. We show that the proposed bootstrap test is asymptotically justified in the sense that it has asymptotically correct size under the null of exogeneity and unit power asymptotically. Our nonparametric test can be applied to cases in which the outcome is generated by an additively non-separable structural relation or in which the outcome is discrete, cases that have not been studied in the literature.
Postprint
Testing serial independence using the sample distribution function
This paper presents and discusses a nonparametric test for detecting serial dependence. We consider a Cramér-von Mises statistic based on the difference between the joint sample distribution and the product of the marginals. Exact critical values can be approximated from the asymptotic null distribution or by resampling, randomly permuting the original series. The approximation based on resampling is more accurate and the corresponding test enjoys, like other bootstrap-based procedures, excellent level accuracy, with level error of order T^{-3/2}. A Monte Carlo experiment illustrates the test performance with small and moderate sample sizes. The paper also includes an application, testing the random walk hypothesis of exchange rate returns for several currencies.
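The procedure described above can be sketched roughly as follows: form the pairs (x_t, x_{t-lag}), compare their joint empirical distribution function with the product of the two marginal ones at the observed points, sum the squared differences, and calibrate by randomly permuting the series. The single lag, the evaluation of the marginals on the paired sample, the scaling of the statistic, and all function names are assumptions for this example and may differ from the paper's exact definition.

```python
import numpy as np


def serial_cvm_statistic(series, lag=1):
    """Cramér-von Mises-type distance between joint and product empirical CDFs."""
    x = np.asarray(series, dtype=float)
    current, lagged = x[lag:], x[:-lag]
    # below_current[i, j] is True when current[j] <= current[i], likewise for lagged.
    below_current = current[None, :] <= current[:, None]
    below_lagged = lagged[None, :] <= lagged[:, None]
    joint = np.mean(below_current & below_lagged, axis=1)
    product = below_current.mean(axis=1) * below_lagged.mean(axis=1)
    return float(np.sum((joint - product) ** 2))


def serial_independence_test(series, lag=1, n_perm=200, seed=0):
    """Return (statistic, p-value), with the null built by permuting the series."""
    rng = np.random.default_rng(seed)
    x = np.asarray(series, dtype=float)
    observed = serial_cvm_statistic(x, lag)
    exceed = 0
    for _ in range(n_perm):
        # A random permutation destroys serial dependence but keeps the marginal.
        if serial_cvm_statistic(rng.permutation(x), lag) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(150)
    ar = np.zeros(150)
    for t in range(1, 150):
        ar[t] = 0.7 * ar[t - 1] + noise[t]  # AR(1) series with serial dependence
    print("AR(1):", serial_independence_test(ar))
    print("iid  :", serial_independence_test(noise))
```

Each permutation recomputes the statistic in O(T²) time, which is adequate for the small and moderate sample sizes the abstract mentions.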