
    A Kernel Test for Three-Variable Interactions

    We introduce kernel nonparametric tests for Lancaster three-variable interaction and for total independence, using embeddings of signed measures into a reproducing kernel Hilbert space. The resulting test statistics are straightforward to compute, and are used in powerful interaction tests, which are consistent against all alternatives for a large family of reproducing kernels. We show the Lancaster test to be sensitive to cases where two independent causes individually have weak influence on a third dependent variable, but their combined effect has a strong influence. This makes the Lancaster test especially suited to finding structure in directed graphical models, where it outperforms competing nonparametric tests in detecting such V-structures.
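    The empirical Lancaster statistic reduces to a simple expression in the centered Gram matrices of the three variables. The sketch below (plain NumPy, not the authors' code) computes that statistic with RBF kernels and a median-heuristic bandwidth, and obtains a permutation p-value by permuting z, which simulates one of the factorizations under which the Lancaster measure vanishes; the paper's full procedure handles the composite null more carefully (e.g., with Holm-corrected tests). All function names are illustrative, and x, y, z are assumed to be (n, d) arrays.

```python
import numpy as np

def rbf_gram(v, bandwidth=None):
    """RBF Gram matrix with a median-heuristic bandwidth; v is an (n, d) array."""
    d2 = np.sum((v[:, None, :] - v[None, :, :]) ** 2, axis=-1)
    if bandwidth is None:
        bandwidth = np.sqrt(0.5 * np.median(d2[d2 > 0]))
    return np.exp(-d2 / (2 * bandwidth ** 2))

def lancaster_stat(x, y, z):
    """Empirical Lancaster interaction statistic: the mean of the
    elementwise product of the three centered Gram matrices."""
    n = len(x)
    H = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    K, L, M = (H @ rbf_gram(v) @ H for v in (x, y, z))
    return np.sum(K * L * M) / n ** 2

def lancaster_perm_test(x, y, z, n_perm=500, seed=0):
    """Permutation p-value obtained by permuting z, which breaks any
    dependence of z on (x, y) while keeping the (x, y) pairing intact."""
    rng = np.random.default_rng(seed)
    obs = lancaster_stat(x, y, z)
    null = np.array([lancaster_stat(x, y, z[rng.permutation(len(z))])
                     for _ in range(n_perm)])
    return obs, (1 + np.sum(null >= obs)) / (1 + n_perm)
```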

    Nonpar MANOVA via Independence Testing

    The k-sample testing problem tests whether or not k groups of data points are sampled from the same distribution. Multivariate analysis of variance (MANOVA) is currently the gold standard for k-sample testing but makes strong, often inappropriate, parametric assumptions. Moreover, independence testing and k-sample testing are tightly related, and there are many nonparametric multivariate independence tests with strong theoretical and empirical properties, including distance correlation (Dcorr) and the Hilbert-Schmidt Independence Criterion (Hsic). We prove that universally consistent independence tests achieve universally consistent k-sample testing and that k-sample statistics like Energy and Maximum Mean Discrepancy (MMD) are exactly equivalent to Dcorr. Empirically evaluating these tests for k-sample scenarios demonstrates that these nonparametric independence tests typically outperform MANOVA, even for Gaussian distributed settings. Finally, we extend these nonparametric k-sample testing procedures to perform multiway and multilevel tests. Thus, we illustrate the existence of many theoretically motivated and empirically performant k-sample tests. A Python package with all independence and k-sample tests, called hyppo, is available from https://hyppo.neurodata.io/.
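    The paper's central observation is that a k-sample problem becomes an independence problem once the samples are pooled and paired with group labels. The sketch below illustrates that reduction with a biased sample distance correlation and a permutation p-value; it is a minimal stand-in for the production implementations in the hyppo package linked above, and all function names are illustrative. Each element of samples is assumed to be an (n_i, d) array.

```python
import numpy as np
from scipy.spatial.distance import cdist

def dcorr(x, y):
    """Biased sample distance correlation: double-center the pairwise
    distance matrices and correlate them."""
    def centered(d):
        return d - d.mean(0, keepdims=True) - d.mean(1, keepdims=True) + d.mean()
    A, B = centered(cdist(x, x)), centered(cdist(y, y))
    dcov2 = (A * B).mean()
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    return 0.0 if denom == 0 else np.sqrt(max(dcov2, 0) / denom)

def ksample_via_independence(samples, n_perm=500, seed=0):
    """Reduce a k-sample problem to an independence test: pool the samples
    and test dependence between the pooled data and one-hot group labels."""
    rng = np.random.default_rng(seed)
    x = np.vstack(samples)
    labels = np.repeat(np.arange(len(samples)), [len(s) for s in samples])
    y = np.eye(len(samples))[labels]             # one-hot group indicators
    obs = dcorr(x, y)
    null = np.array([dcorr(x[rng.permutation(len(x))], y) for _ in range(n_perm)])
    return obs, (1 + np.sum(null >= obs)) / (1 + n_perm)
```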

    Testing Distributional Inequalities and Asymptotic Bias

    When Barrett and Donald (2003) in Econometrica proposed a consistent test of stochastic dominance, they were silent about the asymptotic unbiasedness of their tests against √n-converging Pitman local alternatives. This paper shows that when we focus on first-order stochastic dominance, there exists a wide class of √n-converging Pitman local alternatives against which their test is asymptotically biased, i.e., its local asymptotic power falls strictly below the asymptotic size. This phenomenon applies more generally to one-sided nonparametric tests whose limit under √n-converging Pitman local alternatives is a sup norm of a shifted standard Brownian bridge. Other examples include tests of independence or conditional independence. We provide an intuitive explanation of this phenomenon and illustrate its implications with simulation studies.
    Keywords: Asymptotic Bias, One-sided Tests, Stochastic Dominance, Conditional Independence, Pitman Local Alternatives, Brownian Bridge Processes
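    For concreteness, the Barrett-Donald style test under discussion is built on a one-sided Kolmogorov-Smirnov type statistic: the scaled supremum of the difference between two empirical distribution functions. The sketch below is a minimal illustration of that ingredient with a pooled-bootstrap critical value imposing the least-favourable case F1 = F2; it is not the authors' exact procedure, and all names are illustrative.

```python
import numpy as np

def sd1_statistic(x1, x2):
    """One-sided KS-type statistic for first-order stochastic dominance:
    the scaled supremum of F1_hat - F2_hat over the pooled sample points."""
    grid = np.sort(np.concatenate([x1, x2]))
    f1 = np.searchsorted(np.sort(x1), grid, side="right") / len(x1)
    f2 = np.searchsorted(np.sort(x2), grid, side="right") / len(x2)
    scale = np.sqrt(len(x1) * len(x2) / (len(x1) + len(x2)))
    return scale * np.max(f1 - f2)

def sd1_test(x1, x2, n_boot=999, alpha=0.05, seed=0):
    """Pooled-bootstrap critical value: resampling both samples from the
    pooled data imposes the least-favourable case F1 = F2."""
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([x1, x2])
    boot = np.array([
        sd1_statistic(rng.choice(pooled, size=len(x1), replace=True),
                      rng.choice(pooled, size=len(x2), replace=True))
        for _ in range(n_boot)
    ])
    stat = sd1_statistic(x1, x2)
    return stat, stat > np.quantile(boot, 1 - alpha)
```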

    On Azadkia-Chatterjee's conditional dependence coefficient

    In recent work, Azadkia and Chatterjee (2021) laid out an ingenious approach to defining consistent measures of conditional dependence. Their fully nonparametric approach forms statistics based on ranks and nearest neighbor graphs. The appealing nonparametric consistency of the resulting conditional dependence measure and the associated empirical conditional dependence coefficient has quickly prompted follow-up work that seeks to study its statistical efficiency. In this paper, we take up the framework of conditional randomization tests (CRT) for conditional independence and conduct a power analysis that considers two types of local alternatives, namely, parametric quadratic mean differentiable alternatives and nonparametric Hölder smooth alternatives. Our local power analysis shows that conditional independence tests using the Azadkia-Chatterjee coefficient remain inefficient even when aided by the CRT framework, and serves as motivation to develop variants of the approach; cf. Lin and Han (2022b). As a byproduct, we resolve a conjecture of Azadkia and Chatterjee by proving central limit theorems for the considered conditional dependence coefficients, with explicit formulas for the asymptotic variances. To appear in Bernoulli.
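    As a reference point, the empirical Azadkia-Chatterjee coefficient T_n(Y, Z | X) that these tests build on combines ranks of Y with nearest neighbours in X and in (X, Z). The sketch below is a brute-force version of the coefficient as it is commonly stated, ignoring ties and using Euclidean nearest neighbours; x and z are assumed to be (n, d) arrays, y a one-dimensional array, and the function name is illustrative.

```python
import numpy as np
from scipy.spatial.distance import cdist

def codec(y, z, x):
    """Brute-force conditional dependence coefficient T_n(Y, Z | X):
    ranks of Y combined with nearest neighbours in X and in (X, Z).
    Ties are ignored for simplicity."""
    r = np.argsort(np.argsort(y)) + 1            # R_i: rank of Y_i

    def nn_index(u):
        d = cdist(u, u)
        np.fill_diagonal(d, np.inf)              # exclude the point itself
        return np.argmin(d, axis=1)

    N = nn_index(x)                              # nearest neighbour in X
    M = nn_index(np.hstack([x, z]))              # nearest neighbour in (X, Z)
    num = np.sum(np.minimum(r, r[M]) - np.minimum(r, r[N]))
    den = np.sum(r - np.minimum(r, r[N]))
    return num / den
```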

    On nonparametric and semiparametric testing for multivariate linear time series

    We formulate nonparametric and semiparametric hypothesis testing of multivariate stationary linear time series in a unified fashion and propose new test statistics based on estimators of the spectral density matrix. The limiting distributions of these test statistics under the null hypotheses are always normal, and the tests can be implemented easily for practical use. If a null hypothesis is false, the test statistic diverges to infinity as the sample size grows, so the test is consistent against any alternative. The approach can be applied to various null hypotheses such as the independence between the component series, the equality of the autocovariance functions or the autocorrelation functions of the component series, the separability of the covariance matrix function, and time reversibility. Furthermore, a null hypothesis with a nonlinear constraint, such as the conditional independence between two series, can be tested in the same way. Published in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics, http://dx.doi.org/10.1214/08-AOS610.
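    The key ingredient is a smoothed estimate of the spectral density matrix; under independence of two component series their population cross-spectrum is identically zero at every frequency. The sketch below estimates a normalized cross-spectrum (squared coherence) with SciPy's Welch-type estimators purely as an illustration of that ingredient; the authors' test statistics and their normal limiting distributions are constructed differently, and the function name is illustrative.

```python
import numpy as np
from scipy.signal import csd, welch

def normalized_cross_spectrum(x, y, fs=1.0, nperseg=256):
    """Smoothed cross-spectral density between two component series,
    normalized by the auto-spectra (squared coherence). Under independence
    of the components the population cross-spectrum is identically zero."""
    f, pxy = csd(x, y, fs=fs, nperseg=nperseg)
    _, pxx = welch(x, fs=fs, nperseg=nperseg)
    _, pyy = welch(y, fs=fs, nperseg=nperseg)
    coherence = np.abs(pxy) ** 2 / (pxx * pyy)
    return f, coherence    # e.g. summarize with coherence.max() or coherence.mean()
```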

    A Double Regression Method for Graphical Modeling of High-dimensional Nonlinear and Non-Gaussian Data

    Graphical models have long been studied in statistics as a tool for inferring conditional independence relationships among a large set of random variables. Most existing work in graphical modeling focuses on cases where the data are Gaussian or mixed and the variables are linearly dependent. In this paper, we propose a double regression method for learning graphical models under the high-dimensional nonlinear and non-Gaussian setting, and prove that the proposed method is consistent under mild conditions. The proposed method works by performing a series of nonparametric conditional independence tests. The conditioning set of each test is reduced via a double regression procedure, where a model-free sure independence screening procedure or a sparse deep neural network can be employed. The numerical results indicate that the proposed method works well for high-dimensional nonlinear and non-Gaussian data.
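    Schematically, the double regression step for a candidate edge residualizes both variables on the remaining ones before applying a nonparametric independence test to the residuals. The sketch below uses plain least squares and an absolute-correlation permutation test purely as placeholders for the screened nonparametric regressors (e.g., sparse deep networks) and the nonparametric tests the paper actually employs; all names are illustrative, and data is assumed to be an (n, p) array.

```python
import numpy as np

def residualize(target, others):
    """Least-squares residuals of target on the remaining variables
    (a placeholder for the paper's nonparametric regressors)."""
    design = np.column_stack([np.ones(len(target)), others])
    beta, *_ = np.linalg.lstsq(design, target, rcond=None)
    return target - design @ beta

def edge_test(data, i, j, n_perm=500, seed=0):
    """Double-regression check for an edge (i, j): residualize both
    variables on the rest, then permutation-test dependence between the
    residuals (|correlation| as a simple stand-in for a nonparametric test)."""
    rng = np.random.default_rng(seed)
    rest = np.delete(data, [i, j], axis=1)
    ri, rj = residualize(data[:, i], rest), residualize(data[:, j], rest)
    obs = abs(np.corrcoef(ri, rj)[0, 1])
    null = np.array([abs(np.corrcoef(ri[rng.permutation(len(ri))], rj)[0, 1])
                     for _ in range(n_perm)])
    return obs, (1 + np.sum(null >= obs)) / (1 + n_perm)
```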

    A consistent nonparametric bootstrap test of exogeneity

    This paper proposes a novel way of testing exogeneity of an explanatory variable without any parametric assumptions in the presence of a "conditional" instrumental variable. A testable implication is derived: if an explanatory variable is endogenous, the conditional distribution of the outcome given the endogenous variable is not independent of its instrumental variable(s). The test rejects the null hypothesis with probability one if the explanatory variable is endogenous, and it detects alternatives converging to the null at a rate of n^{-1/2}. We propose a consistent nonparametric bootstrap test to implement this testable implication. We show that the proposed bootstrap test is asymptotically justified in the sense that it produces asymptotically correct size under the null of exogeneity and has unit power asymptotically. Our nonparametric test can be applied to cases in which the outcome is generated by an additively non-separable structural relation or in which the outcome is discrete, settings that have not been studied in the literature.
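    As a rough illustration of the testable implication only: under exogeneity, once the explanatory variable is accounted for, the outcome should carry no information about the instrument. The sketch below residualizes the outcome with a crude k-nearest-neighbour regression and permutation-tests the residuals against the instrument; it probes only the conditional mean rather than the full conditional distribution and is not the authors' bootstrap statistic. Names are illustrative; y and w are one-dimensional arrays and x is an (n, d) array.

```python
import numpy as np
from scipy.spatial.distance import cdist

def knn_regress(y, x, k=10):
    """Crude k-nearest-neighbour estimate of E[Y | X] (a stand-in for the
    paper's fully nonparametric treatment of the conditional distribution)."""
    d = cdist(x, x)
    np.fill_diagonal(d, np.inf)
    idx = np.argsort(d, axis=1)[:, :k]
    return y[idx].mean(axis=1)

def exogeneity_check(y, x, w, n_perm=500, seed=0):
    """Checks one consequence of exogeneity: once X is accounted for,
    the outcome should be unrelated to the instrument W. This probes only
    the conditional mean, not the full conditional distribution."""
    rng = np.random.default_rng(seed)
    resid = y - knn_regress(y, x)
    obs = abs(np.corrcoef(resid, w)[0, 1])
    null = np.array([abs(np.corrcoef(resid, w[rng.permutation(len(w))])[0, 1])
                     for _ in range(n_perm)])
    return obs, (1 + np.sum(null >= obs)) / (1 + n_perm)
```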

    Testing serial independence using the sample distribution function

    This paper presents and discusses a nonparametric test for detecting serial dependence. We consider a Cramér-von Mises statistic based on the difference between the joint sample distribution and the product of the marginals. Exact critical values can be approximated from the asymptotic null distribution or by resampling, randomly permuting the original series. The approximation based on resampling is more accurate, and the corresponding test enjoys, like other bootstrap-based procedures, excellent level accuracy, with level error of order T^{-3/2}. A Monte Carlo experiment illustrates the test's performance with small and moderate sample sizes. The paper also includes an application, testing the random walk hypothesis of exchange rate returns for several currencies.
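    A Cramér-von Mises statistic of this kind can be written down directly: evaluate the joint empirical distribution of (x_t, x_{t+lag}) and the product of the marginal empiricals at the observed pairs, and sum the squared differences, with critical values from random permutations of the series. The sketch below is a minimal version for a single lag; names are illustrative and the exact scaling may differ from the paper's.

```python
import numpy as np

def serial_cvm_stat(x, lag=1):
    """Cramer-von Mises-type statistic comparing the empirical distribution
    of (x_t, x_{t+lag}) with the product of the two marginal empiricals,
    evaluated at the observed pairs."""
    u, v = x[:-lag], x[lag:]
    n = len(u)
    joint = np.mean((u[None, :] <= u[:, None]) & (v[None, :] <= v[:, None]), axis=1)
    marg = (np.mean(u[None, :] <= u[:, None], axis=1)
            * np.mean(v[None, :] <= v[:, None], axis=1))
    return n * np.mean((joint - marg) ** 2)

def serial_cvm_test(x, lag=1, n_perm=999, seed=0):
    """Permutation p-value: randomly permuting the series destroys serial
    dependence while preserving the marginal distribution."""
    rng = np.random.default_rng(seed)
    obs = serial_cvm_stat(x, lag)
    null = np.array([serial_cvm_stat(rng.permutation(x), lag) for _ in range(n_perm)])
    return obs, (1 + np.sum(null >= obs)) / (1 + n_perm)
```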