74 research outputs found

    Generalized structured additive regression based on Bayesian P-splines

    Get PDF
    Generalized additive models (GAM) for modelling nonlinear effects of continuous covariates are now well established tools for the applied statistician. In this paper we develop Bayesian GAM's and extensions to generalized structured additive regression based on one or two dimensional P-splines as the main building block. The approach extends previous work by Lang und Brezger (2003) for Gaussian responses. Inference relies on Markov chain Monte Carlo (MCMC) simulation techniques, and is either based on iteratively weighted least squares (IWLS) proposals or on latent utility representations of (multi)categorical regression models. Our approach covers the most common univariate response distributions, e.g. the Binomial, Poisson or Gamma distribution, as well as multicategorical responses. For the first time, we present Bayesian semiparametric inference for the widely used multinomial logit models. As we will demonstrate through two applications on the forest health status of trees and a space-time analysis of health insurance data, the approach allows realistic modelling of complex problems. We consider the enormous flexibility and extendability of our approach as a main advantage of Bayesian inference based on MCMC techniques compared to more traditional approaches. Software for the methodology presented in the paper is provided within the public domain package BayesX

    The Dynamics of Self-employment in a Developing Country: Evidence from India

    Get PDF
    We examine the spatio-temporal dynamics of self-employment in India using geoadditive models and pseudo panel techniques. We test the claim of Iyigun and Owen (1999) that individuals invest in professional human capital and not in entrepreneurial human capital as an economy develops. The results suggest that in non-agriculture, higher education decreases the likelihood of individuals choosing self-employment over time; however, it has an opposite effect in agriculture. While increases in land possessed increase the likelihood of self-employment choice in agriculture, individuals with small land holdings are more likely to transition into self-employment in non-agriculture. Belonging to a backward class has a negative effect on self-employment choice in both sectors; however, the effect has increased in non-agriculture and remained stable in agriculture. The geoadditive models suggest that the propensity to be self-employed has decreased across most spatial units, although there are few pockets where self-employment is rising again.

    Penalized additive regression for space-time data: a Bayesian perspective

    Get PDF
    We propose extensions of penalized spline generalized additive models for analysing space-time regression data and study them from a Bayesian perspective. Non-linear effects of continuous covariates and time trends are modelled through Bayesian versions of penalized splines, while correlated spatial effects follow a Markov random field prior. This allows to treat all functions and effects within a unified general framework by assigning appropriate priors with different forms and degrees of smoothness. Inference can be performed either with full (FB) or empirical Bayes (EB) posterior analysis. FB inference using MCMC techniques is a slight extension of own previous work. For EB inference, a computationally efficient solution is developed on the basis of a generalized linear mixed model representation. The second approach can be viewed as posterior mode estimation and is closely related to penalized likelihood estimation in a frequentist setting. Variance components, corresponding to smoothing parameters, are then estimated by using marginal likelihood. We carefully compare both inferential procedures in simulation studies and illustrate them through real data applications. The methodology is available in the open domain statistical package BayesX and as an S-plus/R function

    Semiparametric Regression During 2003ā€“2007

    Get PDF
    Semiparametric regression is a fusion between parametric regression and nonparametric regression and the title of a book that we published on the topic in early 2003. We review developments in the field during the five year period since the book was written. We find semiparametric regression to be a vibrant field with substantial involvement and activity, continual enhancement and widespread application

    Variable Selection and Model Choice in Structured Survival Models

    Get PDF
    In many situations, medical applications ask for flexible survival models that allow to extend the classical Cox-model via the inclusion of time-varying and nonparametric effects. These structured survival models are very flexible but additional difficulties arise when model choice and variable selection is desired. In particular, it has to be decided which covariates should be assigned time-varying effects or whether parametric modeling is sufficient for a given covariate. Component-wise boosting provides a means of likelihood-based model fitting that enables simultaneous variable selection and model choice. We introduce a component-wise likelihood-based boosting algorithm for survival data that permits the inclusion of both parametric and nonparametric time-varying effects as well as nonparametric effects of continuous covariates utilizing penalized splines as the main modeling technique. Its properties and performance are investigated in simulation studies. The new modeling approach is used to build a flexible survival model for intensive care patients suffering from severe sepsis. A software implementation is available to the interested reader

    Quantifying Spatial Disparities in Neonatal Mortality Using a Structured Additive Regression Model

    Get PDF
    Background: Neonatal mortality contributes a large proportion towards early childhood mortality in developing countries, with considerable geographical variation at small areas within countries. Methods: A geo-additive logistic regression model is proposed for quantifying small-scale geographical variation in neonatal mortality, and to estimate risk factors of neonatal mortality. Random effects are introduced to capture spatial correlation and heterogeneity. The spatial correlation can be modelled using the Markov random fields (MRF) when data is aggregated, while the two dimensional P-splines apply when exact locations are available, whereas the unstructured spatial effects are assigned an independent Gaussian prior. Socio-economic and bio-demographic factors which may affect the risk of neonatal mortality are simultaneously estimated as fixed effects and as nonlinear effects for continuous covariates. The smooth effects of continuous covariates are modelled by second-order random walk priors. Modelling and inference use the empirical Bayesian approach via penalized likelihood technique. The methodology is applied to analyse the likelihood of neonatal deaths, using data from the 2000 Malawi demographic and health survey. The spatial effects are quantified through MRF and two dimensional P-splines priors. Results: Findings indicate that both fixed and spatial effects are associated with neonatal mortality. Conclusions: Our study, therefore, suggests that the challenge to reduce neonatal mortality goes beyond addressin

    Bayesian P-Splines in Structured Additive Regression Models

    Get PDF
    • ā€¦
    corecore