6,592 research outputs found

    Marginal integration for nonparametric causal inference

    Full text link
    We consider the problem of inferring the total causal effect of a single variable intervention on a (response) variable of interest. We propose a certain marginal integration regression technique for a very general class of potentially nonlinear structural equation models (SEMs) with known structure, or at least known superset of adjustment variables: we call the procedure S-mint regression. We easily derive that it achieves the convergence rate as for nonparametric regression: for example, single variable intervention effects can be estimated with convergence rate n2/5n^{-2/5} assuming smoothness with twice differentiable functions. Our result can also be seen as a major robustness property with respect to model misspecification which goes much beyond the notion of double robustness. Furthermore, when the structure of the SEM is not known, we can estimate (the equivalence class of) the directed acyclic graph corresponding to the SEM, and then proceed by using S-mint based on these estimates. We empirically compare the S-mint regression method with more classical approaches and argue that the former is indeed more robust, more reliable and substantially simpler.Comment: 40 pages, 14 figure

    A unified wavelet-based modelling framework for non-linear system identification: the WANARX model structure

    Get PDF
    A new unified modelling framework based on the superposition of additive submodels, functional components, and wavelet decompositions is proposed for non-linear system identification. A non-linear model, which is often represented using a multivariate non-linear function, is initially decomposed into a number of functional components via the wellknown analysis of variance (ANOVA) expression, which can be viewed as a special form of the NARX (non-linear autoregressive with exogenous inputs) model for representing dynamic input–output systems. By expanding each functional component using wavelet decompositions including the regular lattice frame decomposition, wavelet series and multiresolution wavelet decompositions, the multivariate non-linear model can then be converted into a linear-in-theparameters problem, which can be solved using least-squares type methods. An efficient model structure determination approach based upon a forward orthogonal least squares (OLS) algorithm, which involves a stepwise orthogonalization of the regressors and a forward selection of the relevant model terms based on the error reduction ratio (ERR), is employed to solve the linear-in-the-parameters problem in the present study. The new modelling structure is referred to as a wavelet-based ANOVA decomposition of the NARX model or simply WANARX model, and can be applied to represent high-order and high dimensional non-linear systems

    Invariant Causal Prediction for Nonlinear Models

    Full text link
    An important problem in many domains is to predict how a system will respond to interventions. This task is inherently linked to estimating the system's underlying causal structure. To this end, Invariant Causal Prediction (ICP) (Peters et al., 2016) has been proposed which learns a causal model exploiting the invariance of causal relations using data from different environments. When considering linear models, the implementation of ICP is relatively straightforward. However, the nonlinear case is more challenging due to the difficulty of performing nonparametric tests for conditional independence. In this work, we present and evaluate an array of methods for nonlinear and nonparametric versions of ICP for learning the causal parents of given target variables. We find that an approach which first fits a nonlinear model with data pooled over all environments and then tests for differences between the residual distributions across environments is quite robust across a large variety of simulation settings. We call this procedure "invariant residual distribution test". In general, we observe that the performance of all approaches is critically dependent on the true (unknown) causal structure and it becomes challenging to achieve high power if the parental set includes more than two variables. As a real-world example, we consider fertility rate modelling which is central to world population projections. We explore predicting the effect of hypothetical interventions using the accepted models from nonlinear ICP. The results reaffirm the previously observed central causal role of child mortality rates

    Generalised additive multiscale wavelet models constructed using particle swarm optimisation and mutual information for spatio-temporal evolutionary system representation

    Get PDF
    A new class of generalised additive multiscale wavelet models (GAMWMs) is introduced for high dimensional spatio-temporal evolutionary (STE) system identification. A novel two-stage hybrid learning scheme is developed for constructing such an additive wavelet model. In the first stage, a new orthogonal projection pursuit (OPP) method, implemented using a particle swarm optimisation(PSO) algorithm, is proposed for successively augmenting an initial coarse wavelet model, where relevant parameters of the associated wavelets are optimised using a particle swarm optimiser. The resultant network model, obtained in the first stage, may however be a redundant model. In the second stage, a forward orthogonal regression (FOR) algorithm, implemented using a mutual information method, is then applied to refine and improve the initially constructed wavelet model. The proposed two-stage hybrid method can generally produce a parsimonious wavelet model, where a ranked list of wavelet functions, according to the capability of each wavelet to represent the total variance in the desired system output signal is produced. The proposed new modelling framework is applied to real observed images, relative to a chemical reaction exhibiting a spatio-temporal evolutionary behaviour, and the associated identification results show that the new modelling framework is applicable and effective for handling high dimensional identification problems of spatio-temporal evolution sytems

    A Bayesian semiparametric latent variable model for mixed responses

    Get PDF
    In this article we introduce a latent variable model (LVM) for mixed ordinal and continuous responses, where covariate effects on the continuous latent variables are modelled through a flexible semiparametric predictor. We extend existing LVM with simple linear covariate effects by including nonparametric components for nonlinear effects of continuous covariates and interactions with other covariates as well as spatial effects. Full Bayesian modelling is based on penalized spline and Markov random field priors and is performed by computationally efficient Markov chain Monte Carlo (MCMC) methods. We apply our approach to a large German social science survey which motivated our methodological development

    An adaptive orthogonal search algorithm for model subset selection and non-linear system identification

    Get PDF
    A new adaptive orthogonal search (AOS) algorithm is proposed for model subset selection and non-linear system identification. Model structure detection is a key step in any system identification problem. This consists of selecting significant model terms from a redundant dictionary of candidate model terms, and determining the model complexity (model length or model size). The final objective is to produce a parsimonious model that can well capture the inherent dynamics of the underlying system. In the new AOS algorithm, a modified generalized cross-validation criterion, called the adjustable prediction error sum of squares (APRESS), is introduced and incorporated into a forward orthogonal search procedure. The main advantage of the new AOS algorithm is that the mechanism is simple and the implementation is direct and easy, and more importantly it can produce efficient model subsets for most non-linear identification problems

    Bayesian Approximate Kernel Regression with Variable Selection

    Full text link
    Nonlinear kernel regression models are often used in statistics and machine learning because they are more accurate than linear models. Variable selection for kernel regression models is a challenge partly because, unlike the linear regression setting, there is no clear concept of an effect size for regression coefficients. In this paper, we propose a novel framework that provides an effect size analog of each explanatory variable for Bayesian kernel regression models when the kernel is shift-invariant --- for example, the Gaussian kernel. We use function analytic properties of shift-invariant reproducing kernel Hilbert spaces (RKHS) to define a linear vector space that: (i) captures nonlinear structure, and (ii) can be projected onto the original explanatory variables. The projection onto the original explanatory variables serves as an analog of effect sizes. The specific function analytic property we use is that shift-invariant kernel functions can be approximated via random Fourier bases. Based on the random Fourier expansion we propose a computationally efficient class of Bayesian approximate kernel regression (BAKR) models for both nonlinear regression and binary classification for which one can compute an analog of effect sizes. We illustrate the utility of BAKR by examining two important problems in statistical genetics: genomic selection (i.e. phenotypic prediction) and association mapping (i.e. inference of significant variants or loci). State-of-the-art methods for genomic selection and association mapping are based on kernel regression and linear models, respectively. BAKR is the first method that is competitive in both settings.Comment: 22 pages, 3 figures, 3 tables; theory added; new simulations presented; references adde

    Semiparametric Bayesian inference in multiple equation models

    Get PDF
    This paper outlines an approach to Bayesian semiparametric regression in multiple equation models which can be used to carry out inference in seemingly unrelated regressions or simultaneous equations models with nonparametric components. The approach treats the points on each nonparametric regression line as unknown parameters and uses a prior on the degree of smoothness of each line to ensure valid posterior inference despite the fact that the number of parameters is greater than the number of observations. We develop an empirical Bayesian approach that allows us to estimate the prior smoothing hyperparameters from the data. An advantage of our semiparametric model is that it is written as a seemingly unrelated regressions model with independent normal-Wishart prior. Since this model is a common one, textbook results for posterior inference, model comparison, prediction and posterior computation are immediately available. We use this model in an application involving a two-equation structural model drawn from the labour and returns to schooling literatures

    Functional Regression

    Full text link
    Functional data analysis (FDA) involves the analysis of data whose ideal units of observation are functions defined on some continuous domain, and the observed data consist of a sample of functions taken from some population, sampled on a discrete grid. Ramsay and Silverman's 1997 textbook sparked the development of this field, which has accelerated in the past 10 years to become one of the fastest growing areas of statistics, fueled by the growing number of applications yielding this type of data. One unique characteristic of FDA is the need to combine information both across and within functions, which Ramsay and Silverman called replication and regularization, respectively. This article will focus on functional regression, the area of FDA that has received the most attention in applications and methodological development. First will be an introduction to basis functions, key building blocks for regularization in functional regression methods, followed by an overview of functional regression methods, split into three types: [1] functional predictor regression (scalar-on-function), [2] functional response regression (function-on-scalar) and [3] function-on-function regression. For each, the role of replication and regularization will be discussed and the methodological development described in a roughly chronological manner, at times deviating from the historical timeline to group together similar methods. The primary focus is on modeling and methodology, highlighting the modeling structures that have been developed and the various regularization approaches employed. At the end is a brief discussion describing potential areas of future development in this field
    corecore