
    A consistent nonparametric bootstrap test of exogeneity

    This paper proposes a novel way of testing exogeneity of an explanatory variable without any parametric assumptions in the presence of a "conditional" instrumental variable. A testable implication is derived that if an explanatory variable is endogenous, the conditional distribution of the outcome given the endogenous variable is not independent of its instrumental variable(s). The test rejects the null hypothesis with probability one if the explanatory variable is endogenous and it detects alternatives converging to the null at a rate n^{-1/2}. We propose a consistent nonparametric bootstrap test to implement this testable implication. We show that the proposed bootstrap test can be asymptotically justified in the sense that it produces asymptotically correct size under the null of exogeneity, and it has unit power asymptotically. Our nonparametric test can be applied to the cases in which the outcome is generated by an additively non-separable structural relation or in which the outcome is discrete, which has not been studied in the literature.

    On the power of conditional independence testing under model-X

    For testing conditional independence (CI) of a response Y and a predictor X given covariates Z, the recently introduced model-X (MX) framework has been the subject of active methodological research, especially in the context of MX knockoffs and their successful application to genome-wide association studies. In this paper, we study the power of MX CI tests, yielding quantitative explanations for empirically observed phenomena and novel insights to guide the design of MX methodology. We show that any valid MX CI test must also be valid conditionally on Y and Z; this conditioning allows us to reformulate the problem as testing a point null hypothesis involving the conditional distribution of X. The Neyman-Pearson lemma then implies that the conditional randomization test (CRT) based on a likelihood statistic is the most powerful MX CI test against a point alternative. We also obtain a related optimality result for MX knockoffs. Switching to an asymptotic framework with arbitrarily growing covariate dimension, we derive an expression for the limiting power of the CRT against local semiparametric alternatives in terms of the prediction error of the machine learning algorithm on which its test statistic is based. Finally, we exhibit a resampling-free test with uniform asymptotic Type-I error control under the assumption that only the first two moments of X given Z are known, a significant relaxation of the MX assumption.
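The conditional randomization test mentioned in this abstract can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes the law of X given Z is known exactly (here a hypothetical Gaussian with mean 0.8·Z and unit variance), and it uses an absolute-covariance statistic for brevity rather than the likelihood statistic the abstract identifies as most powerful. The names `crt_pvalue`, `abs_cov`, `sample_x` and `simulate` are all illustrative.

```python
import random

def crt_pvalue(x, y, z, sample_x_given_z, statistic, n_resamples=500, seed=0):
    """Conditional randomization test p-value for H0: X independent of Y given Z."""
    rng = random.Random(seed)
    t_obs = statistic(x, y, z)
    exceed = 0
    for _ in range(n_resamples):
        # Redraw X from its (assumed known) conditional law, holding Y and Z fixed.
        x_tilde = [sample_x_given_z(zi, rng) for zi in z]
        if statistic(x_tilde, y, z) >= t_obs:
            exceed += 1
    # Counting the observed data as one draw keeps the p-value valid in finite samples.
    return (1 + exceed) / (1 + n_resamples)

def abs_cov(x, y, z):
    """Illustrative statistic: absolute sample covariance of X and Y (z is unused)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return abs(sum((a - mx) * (b - my) for a, b in zip(x, y)) / n)

def sample_x(zi, rng):
    """Assumed model-X law: X | Z = z is Gaussian with mean 0.8 * z and unit variance."""
    return rng.gauss(0.8 * zi, 1.0)

def simulate(n, effect, seed):
    """Toy data: Y depends on Z, and on X only when effect != 0."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x = [sample_x(zi, rng) for zi in z]
    y = [zi + effect * xi + rng.gauss(0.0, 1.0) for xi, zi in zip(x, z)]
    return x, y, z
```

A call such as `crt_pvalue(*simulate(300, 1.0, seed=1), sample_x, abs_cov)` returns a small p-value when X has a direct effect on Y. With `effect=0.0` the p-value is valid because the resampled copies of X share the same conditional law as the observed X.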

    Invariant Causal Prediction for Nonlinear Models

    An important problem in many domains is to predict how a system will respond to interventions. This task is inherently linked to estimating the system's underlying causal structure. To this end, Invariant Causal Prediction (ICP) (Peters et al., 2016) has been proposed, which learns a causal model exploiting the invariance of causal relations using data from different environments. When considering linear models, the implementation of ICP is relatively straightforward. However, the nonlinear case is more challenging due to the difficulty of performing nonparametric tests for conditional independence. In this work, we present and evaluate an array of methods for nonlinear and nonparametric versions of ICP for learning the causal parents of given target variables. We find that an approach which first fits a nonlinear model with data pooled over all environments and then tests for differences between the residual distributions across environments is quite robust across a large variety of simulation settings. We call this procedure "invariant residual distribution test". In general, we observe that the performance of all approaches is critically dependent on the true (unknown) causal structure and it becomes challenging to achieve high power if the parental set includes more than two variables. As a real-world example, we consider fertility rate modelling which is central to world population projections. We explore predicting the effect of hypothetical interventions using the accepted models from nonlinear ICP. The results reaffirm the previously observed central causal role of child mortality rates.
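The "fit on pooled data, then compare residuals across environments" idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' procedure: the nonlinear fit is a crude binned-mean regression, the comparison uses a two-sample Kolmogorov-Smirnov statistic, and the p-value comes from permuting environment labels, which ignores the effect of model fitting on the residuals. All function names are hypothetical, and exactly two environments are assumed.

```python
import math
import random

def fit_binned_means(x, y, n_bins=10):
    """Crude nonlinear fit: predict the mean of y within each equal-width bin of x."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / n_bins or 1.0
    def bin_of(v):
        return min(max(int((v - lo) / width), 0), n_bins - 1)
    sums, counts = [0.0] * n_bins, [0] * n_bins
    for xi, yi in zip(x, y):
        sums[bin_of(xi)] += yi
        counts[bin_of(xi)] += 1
    overall = sum(y) / len(y)
    means = [s / c if c else overall for s, c in zip(sums, counts)]
    return lambda v: means[bin_of(v)]

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        v = min(a[i], b[j])
        while i < len(a) and a[i] <= v:
            i += 1
        while j < len(b) and b[j] <= v:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def residual_invariance_pvalue(x, y, env, n_perm=500, seed=0):
    """Permutation p-value for equal residual distributions in two environments."""
    predict = fit_binned_means(x, y)  # model fitted on data pooled over environments
    resid = [yi - predict(xi) for xi, yi in zip(x, y)]
    first, second = sorted(set(env))  # assumes exactly two environment labels
    def stat(labels):
        r0 = [r for r, e in zip(resid, labels) if e == first]
        r1 = [r for r, e in zip(resid, labels) if e == second]
        return ks_statistic(r0, r1)
    t_obs = stat(env)
    rng = random.Random(seed)
    shuffled = list(env)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if stat(shuffled) >= t_obs:
            exceed += 1
    return (1 + exceed) / (1 + n_perm)

def simulate(n_per_env, shift, seed):
    """Toy data: y = sin(x) + shift * env + noise; shift != 0 breaks invariance."""
    rng = random.Random(seed)
    x, y, env = [], [], []
    for e in (0, 1):
        for _ in range(n_per_env):
            xi = rng.uniform(-3.0, 3.0)
            x.append(xi)
            y.append(math.sin(xi) + shift * e + rng.gauss(0.0, 0.3))
            env.append(e)
    return x, y, env
```

In the ICP setting, a candidate parent set would be accepted when this kind of test does not reject. Here a single predictor stands in for the candidate set.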

    Testing the Markov property with ultra-high frequency financial data

    This paper develops a framework to nonparametrically test whether discrete-valued irregularly-spaced financial transactions data follow a Markov process. For that purpose, we consider a specific optional sampling in which a continuous-time Markov process is observed only when it crosses some discrete level. This framework is convenient because it accommodates not only the irregular spacing of transactions data, but also price discreteness. Under such an observation rule, the current price duration is independent of previous price durations given the current price realization. A simple nonparametric test then follows by examining whether this conditional independence property holds. Finally, we investigate whether or not bid-ask spreads follow Markov processes using transactions data from the New York Stock Exchange. The motivation lies in the fact that asymmetric information models of market microstructures predict that the Markov property does not hold for the bid-ask spread. The results are mixed in the sense that the Markov assumption is rejected for three out of the five stocks we have analyzed.
    Keywords: bid-ask spread, nonparametric testing, price durations, Markov property, ultra-high frequency data

    Model Adequacy Checks for Discrete Choice Dynamic Models

    This paper proposes new parametric model adequacy tests for possibly nonlinear and nonstationary time series models with noncontinuous data distribution, which is often the case in applied work. In particular, we consider the correct specification of parametric conditional distributions in dynamic discrete choice models, not only of some particular conditional characteristics such as moments or symmetry. Knowing the true distribution is important in many circumstances, in particular to apply efficient maximum likelihood methods, obtain consistent estimates of partial effects and appropriate predictions of the probability of future events. We propose a transformation of data which under the true conditional distribution leads to continuous uniform iid series. The uniformity and serial independence of the new series is then examined simultaneously. The transformation can be considered as an extension of the integral transform tool for noncontinuous data. We derive asymptotic properties of such tests taking into account the parameter estimation effect. Since transformed series are iid, we do not require any mixing conditions and asymptotic results illustrate the double simultaneous checking nature of our test. The test statistic converges under the null with a parametric rate to the asymptotic distribution, which is case dependent, hence we justify a parametric bootstrap approximation. The test has power against local alternatives and is consistent. The performance of the new tests is compared with classical specification checks for discrete choice models.
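One standard way to extend the probability integral transform to discrete data is the randomized version, which spreads each observation uniformly over the jump of the conditional CDF at that value. The sketch below applies it to a hypothetical dynamic binary model (a logistic dependence on the previous outcome with made-up coefficients). Whether this coincides with the transformation used in the paper is an assumption; the abstract only says that the result is continuous uniform and iid under the true conditional distribution.

```python
import math
import random

def bernoulli_cdf(k, p):
    """CDF of a Bernoulli(p) variable evaluated at k."""
    if k < 0:
        return 0.0
    if k < 1:
        return 1.0 - p
    return 1.0

def randomized_pit(y, cdf, rng):
    """Randomized PIT for integer-valued y: uniform draw on [F(y - 1), F(y)]."""
    lower = cdf(y - 1)  # equals P(Y < y) for integer-valued Y
    upper = cdf(y)
    return lower + rng.random() * (upper - lower)

def simulate_and_transform(n, a=-0.5, b=1.0, seed=0):
    """Simulate y_t ~ Bernoulli(logistic(a + b * y_{t-1})) and transform each y_t
    with its true conditional CDF (the coefficients a and b are illustrative)."""
    rng = random.Random(seed)
    y_prev = 0
    ys, us = [], []
    for _ in range(n):
        p = 1.0 / (1.0 + math.exp(-(a + b * y_prev)))
        y = 1 if rng.random() < p else 0
        us.append(randomized_pit(y, lambda k: bernoulli_cdf(k, p), rng))
        ys.append(y)
        y_prev = y
    return ys, us
```

Under the correct conditional model the transformed values behave like independent Uniform(0, 1) draws even though the raw outcomes are serially dependent. A misspecified `p` would show up as non-uniformity or serial dependence in `us`, which is what a joint check of both properties would target.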

    The conditional permutation test for independence while controlling for confounders

    We propose a general new method, the conditional permutation test, for testing the conditional independence of variables X and Y given a potentially high-dimensional random vector Z that may contain confounding factors. The proposed test permutes entries of X non-uniformly, so as to respect the existing dependence between X and Z and thus account for the presence of these confounders. Like the conditional randomization test of Candès et al. (2018), our test relies on the availability of an approximation to the distribution of X ∣ Z. While the test of Candès et al. (2018) uses this estimate to draw new X values, for our test we use this approximation to design an appropriate non-uniform distribution on permutations of the X values already seen in the true data. We provide an efficient Markov Chain Monte Carlo sampler for the implementation of our method, and establish bounds on the Type I error in terms of the error in the approximation of the conditional distribution of X ∣ Z, finding that, for the worst case test statistic, the inflation in Type I error of the conditional permutation test is no larger than that of the conditional randomization test. We validate these theoretical results with experiments on simulated data and on the Capital Bikeshare data set.
    Comment: 31 pages, 4 figures
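A non-uniform permutation of the observed X values can be drawn with a pairwise-swap Markov chain, sketched below. This is a simplified illustration and not the paper's sampler: the conditional density of X given Z is taken to be a hypothetical unit-variance Gaussian with mean 0.8·Z, each copy is generated by running the chain from the identity permutation, and the initialization scheme a careful implementation would use to guarantee exchangeability is omitted. The function names and the absolute-covariance statistic are illustrative choices.

```python
import math
import random

def log_q(x, z, beta=0.8):
    """Assumed log-density of X given Z, up to a constant: Gaussian, mean beta * z."""
    return -0.5 * (x - beta * z) ** 2

def cpt_permutation(x, z, n_sweeps, rng):
    """Draw a permutation with probability roughly proportional to
    prod_i q(x[perm[i]] | z[i]), using random disjoint pairwise swaps."""
    n = len(x)
    perm = list(range(n))
    for _ in range(n_sweeps):
        idx = list(range(n))
        rng.shuffle(idx)
        for i, j in zip(idx[0::2], idx[1::2]):
            xi, xj = x[perm[i]], x[perm[j]]
            # Log-odds of the swapped assignment versus the current one.
            log_odds = (log_q(xj, z[i]) + log_q(xi, z[j])
                        - log_q(xi, z[i]) - log_q(xj, z[j]))
            log_odds = max(-50.0, min(50.0, log_odds))  # guard against overflow
            if rng.random() < 1.0 / (1.0 + math.exp(-log_odds)):
                perm[i], perm[j] = perm[j], perm[i]
    return perm

def abs_cov(x, y):
    """Illustrative statistic: absolute sample covariance of X and Y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return abs(sum((a - mx) * (b - my) for a, b in zip(x, y)) / n)

def cpt_pvalue(x, y, z, n_copies=100, n_sweeps=30, seed=0):
    """Compare the observed statistic with its values on permuted copies of X."""
    rng = random.Random(seed)
    t_obs = abs_cov(x, y)
    exceed = 0
    for _ in range(n_copies):
        perm = cpt_permutation(x, z, n_sweeps, rng)
        x_perm = [x[k] for k in perm]
        if abs_cov(x_perm, y) >= t_obs:
            exceed += 1
    return (1 + exceed) / (1 + n_copies)

def simulate(n, effect, seed):
    """Toy data: Z confounds X and Y; X affects Y only when effect != 0."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x = [rng.gauss(0.8 * zi, 1.0) for zi in z]
    y = [zi + effect * xi + rng.gauss(0.0, 1.0) for xi, zi in zip(x, z)]
    return x, y, z
```

Because a swap is more likely when it keeps each X value near a Z value it plausibly came from, the permuted copies retain the X-Z dependence while breaking any direct link between X and Y. A uniform permutation would not do this.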