447,360 research outputs found

    Sequences of regressions and their independences

    Full text link
    Ordered sequences of univariate or multivariate regressions provide statistical models for analysing data from randomized, possibly sequential interventions, from cohort or multi-wave panel studies, but also from cross-sectional or retrospective studies. Conditional independences are captured by what we name regression graphs, provided the generated distribution shares some properties with a joint Gaussian distribution. Regression graphs extend purely directed, acyclic graphs by two types of undirected graph, one type for components of joint responses and the other for components of the context vector variable. We review the special features and the history of regression graphs, derive criteria to read all implied independences of a regression graph and prove criteria for Markov equivalence that is to judge whether two different graphs imply the same set of independence statements. Knowledge of Markov equivalence provides alternative interpretations of a given sequence of regressions, is essential for machine learning strategies and permits to use the simple graphical criteria of regression graphs on graphs for which the corresponding criteria are in general more complex. Under the known conditions that a Markov equivalent directed acyclic graph exists for any given regression graph, we give a polynomial time algorithm to find one such graph.Comment: 43 pages with 17 figures The manuscript is to appear as an invited discussion paper in the journal TES

    Causal Discovery with Continuous Additive Noise Models

    Get PDF
    We consider the problem of learning causal directed acyclic graphs from an observational joint distribution. One can use these graphs to predict the outcome of interventional experiments, from which data are often not available. We show that if the observational distribution follows a structural equation model with an additive noise structure, the directed acyclic graph becomes identifiable from the distribution under mild conditions. This constitutes an interesting alternative to traditional methods that assume faithfulness and identify only the Markov equivalence class of the graph, thus leaving some edges undirected. We provide practical algorithms for finitely many samples, RESIT (Regression with Subsequent Independence Test) and two methods based on an independence score. We prove that RESIT is correct in the population setting and provide an empirical evaluation

    Cluster-Robust Variance Estimation for Dyadic Data

    Get PDF
    Dyadic data are common in the social sciences, although inference for such settings involves accounting for a complex clustering structure. Many analyses in the social sciences fail to account for the fact that multiple dyads share a member, and that errors are thus likely correlated across these dyads. We propose a nonparametric sandwich-type robust variance estimator for linear regression to account for such clustering in dyadic data. We enumerate conditions for estimator consistency. We also extend our results to repeated and weighted observations, including directed dyads and longitudinal data, and provide an implementation for generalized linear models such as logistic regression. We examine empirical performance with simulations and applications to international relations and speed dating

    Marginal integration for nonparametric causal inference

    Full text link
    We consider the problem of inferring the total causal effect of a single variable intervention on a (response) variable of interest. We propose a certain marginal integration regression technique for a very general class of potentially nonlinear structural equation models (SEMs) with known structure, or at least known superset of adjustment variables: we call the procedure S-mint regression. We easily derive that it achieves the convergence rate as for nonparametric regression: for example, single variable intervention effects can be estimated with convergence rate n−2/5n^{-2/5} assuming smoothness with twice differentiable functions. Our result can also be seen as a major robustness property with respect to model misspecification which goes much beyond the notion of double robustness. Furthermore, when the structure of the SEM is not known, we can estimate (the equivalence class of) the directed acyclic graph corresponding to the SEM, and then proceed by using S-mint based on these estimates. We empirically compare the S-mint regression method with more classical approaches and argue that the former is indeed more robust, more reliable and substantially simpler.Comment: 40 pages, 14 figure
    • …
    corecore