
    BayesX - Software for Bayesian Inference based on Markov Chain Monte Carlo simulation techniques

    BayesX is a software tool for Bayesian inference based on Markov chain Monte Carlo (MCMC) simulation techniques. The main feature of BayesX so far is a very powerful regression tool for Bayesian semiparametric regression within the generalized linear models framework. BayesX is able to estimate nonlinear effects of metrical covariates, trends and flexible seasonal patterns of time scales, structured and/or unstructured random effects of spatial covariates (geographical data), and unstructured random effects of unordered group indicators. Moreover, BayesX is able to estimate varying coefficient models with metrical and even spatial covariates as effect modifiers. The distribution of the response can be either Gaussian, binomial or Poisson. In addition, BayesX has some useful functions for handling and manipulating datasets and geographical maps.
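
    For illustration, an analogous semiparametric Poisson model can be sketched in R with the mgcv package as a stand-in (BayesX itself is driven by its own command language); the simulated data, covariate names and term choices below are assumptions, not BayesX output.

```r
## Analogous structured additive Poisson model in mgcv (not BayesX itself);
## all data are simulated and all term choices are illustrative.
library(mgcv)

set.seed(1)
n     <- 500
x     <- runif(n)                                 # metrical covariate, nonlinear effect
month <- sample(1:12, n, replace = TRUE)          # time scale with a seasonal pattern
group <- factor(sample(1:20, n, replace = TRUE))  # unordered group indicator
u     <- rnorm(20, sd = 0.3)                      # true random effects
eta   <- sin(2 * pi * x) + 0.5 * cos(2 * pi * month / 12) + u[as.integer(group)]
y     <- rpois(n, exp(eta))                       # Gaussian or binomial work analogously

fit <- gam(y ~ s(x, bs = "ps") +            # P-spline for the nonlinear effect
             s(month, bs = "cc", k = 10) +  # cyclic spline for the seasonal pattern
             s(group, bs = "re"),           # unstructured random effect
           family = poisson)
summary(fit)
```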

    Penalized likelihood estimation and iterative Kalman smoothing for non-Gaussian dynamic regression models

    Dynamic regression or state space models provide a flexible framework for analyzing non-Gaussian time series and longitudinal data, covering, for example, models for discrete longitudinal observations. As for non-Gaussian random coefficient models, a direct Bayesian approach leads to numerical integration problems, often intractable for more complicated data sets. Recent Markov chain Monte Carlo methods avoid this by repeated sampling from approximate posterior distributions, but there are still open questions about sampling schemes and convergence. In this article we consider simpler methods of inference based on posterior modes or, equivalently, maximum penalized likelihood estimation. From the latter point of view, the approach can also be interpreted as a nonparametric method for smoothing time-varying coefficients. Efficient smoothing algorithms are obtained by iterating common linear Kalman filtering and smoothing, in the same way as estimation in generalized linear models with fixed effects can be performed by iteratively weighted least squares. The algorithm can be combined with an EM-type method or cross-validation to estimate unknown hyper- or smoothing parameters. The approach is illustrated by applications to a binary time series and a multicategorical longitudinal data set.
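
    As a rough illustration of the Gaussian building block, a scalar local-level Kalman filter and smoother can be written in a few lines of base R; the paper's non-Gaussian case iterates such a smoother on IWLS-type working observations, which is not reproduced here, and the function and variable names are invented for this sketch.

```r
## Kalman filter + Rauch-Tung-Striebel smoother for a scalar local-level model
##   state:       a_t = a_{t-1} + w_t,  w_t ~ N(0, q)
##   observation: y_t = a_t + v_t,      v_t ~ N(0, r)
## Only the Gaussian building block; the non-Gaussian case in the paper
## iterates a smoother like this on IWLS-type working observations.
kalman_smooth <- function(y, q, r, a0 = 0, P0 = 1e6) {
  n  <- length(y)
  ap <- Pp <- af <- Pf <- numeric(n)   # predicted / filtered means and variances
  a  <- a0; P <- P0
  for (t in 1:n) {                     # forward filtering pass
    ap[t] <- a
    Pp[t] <- P + q
    K     <- Pp[t] / (Pp[t] + r)       # Kalman gain
    af[t] <- ap[t] + K * (y[t] - ap[t])
    Pf[t] <- (1 - K) * Pp[t]
    a <- af[t]; P <- Pf[t]
  }
  a_s <- af; P_s <- Pf                 # backward smoothing pass
  for (t in (n - 1):1) {
    J      <- Pf[t] / Pp[t + 1]
    a_s[t] <- af[t] + J * (a_s[t + 1] - ap[t + 1])
    P_s[t] <- Pf[t] + J^2 * (P_s[t + 1] - Pp[t + 1])
  }
  list(mean = a_s, var = P_s)
}

## simulated example: recover a noisy random-walk signal
set.seed(2)
alpha <- cumsum(rnorm(200, sd = 0.1))
y     <- alpha + rnorm(200, sd = 0.5)
sm    <- kalman_smooth(y, q = 0.01, r = 0.25)
```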

    Functional Regression

    Functional data analysis (FDA) involves the analysis of data whose ideal units of observation are functions defined on some continuous domain, and the observed data consist of a sample of functions taken from some population, sampled on a discrete grid. Ramsay and Silverman's 1997 textbook sparked the development of this field, which has accelerated in the past 10 years to become one of the fastest-growing areas of statistics, fueled by the growing number of applications yielding this type of data. One unique characteristic of FDA is the need to combine information both across and within functions, which Ramsay and Silverman called replication and regularization, respectively. This article will focus on functional regression, the area of FDA that has received the most attention in applications and methodological development. First will be an introduction to basis functions, key building blocks for regularization in functional regression methods, followed by an overview of functional regression methods, split into three types: [1] functional predictor regression (scalar-on-function), [2] functional response regression (function-on-scalar) and [3] function-on-function regression. For each, the role of replication and regularization will be discussed and the methodological development described in a roughly chronological manner, at times deviating from the historical timeline to group together similar methods. The primary focus is on modeling and methodology, highlighting the modeling structures that have been developed and the various regularization approaches employed. At the end is a brief discussion describing potential areas of future development in this field.
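
    A scalar-on-function regression of the first type can be sketched with a B-spline expansion of the coefficient function and a simple roughness penalty; the simulated curves, basis dimension and smoothing parameter below are illustrative assumptions, not taken from the article.

```r
## Scalar-on-function regression via a B-spline expansion of beta(t) and a
## second-order difference (roughness) penalty; everything here is simulated.
library(splines)

set.seed(3)
n  <- 200                              # number of sampled curves
tt <- seq(0, 1, length.out = 100)      # common observation grid
dt <- tt[2] - tt[1]

## smooth random predictor curves X_i(t), one per row
X <- t(replicate(n, cumsum(rnorm(length(tt), sd = 0.1)) +
                    sin(2 * pi * tt * runif(1, 0.5, 2))))
beta_true <- sin(2 * pi * tt)                      # true coefficient function
y <- drop(X %*% beta_true) * dt + rnorm(n, sd = 0.1)

## beta(t) = B(t) %*% b, with the integral approximated by a Riemann sum
B <- bs(tt, df = 15, intercept = TRUE)             # basis evaluated on the grid
Z <- (X %*% B) * dt                                # n x 15 design matrix
D <- diff(diag(ncol(B)), differences = 2)          # second-order differences
lambda <- 1                                        # smoothing parameter (fixed here)
b_hat    <- solve(crossprod(Z) + lambda * crossprod(D), crossprod(Z, y))
beta_hat <- drop(B %*% b_hat)                      # estimated coefficient function
```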

    Spike-and-Slab Priors for Function Selection in Structured Additive Regression Models

    Structured additive regression provides a general framework for complex Gaussian and non-Gaussian regression models, with predictors comprising arbitrary combinations of nonlinear functions and surfaces, spatial effects, varying coefficients, random effects and further regression terms. The large flexibility of structured additive regression makes function selection a challenging and important task, aiming at (1) selecting the relevant covariates, (2) choosing an appropriate and parsimonious representation of the impact of covariates on the predictor and (3) determining the required interactions. We propose a spike-and-slab prior structure for function selection that allows us to include or exclude single coefficients as well as blocks of coefficients representing specific model terms. A novel multiplicative parameter expansion is required to obtain good mixing and convergence properties in a Markov chain Monte Carlo simulation approach and is shown to induce desirable shrinkage properties. In simulation studies and with (real) benchmark classification data, we investigate sensitivity to hyperparameter settings and compare performance to competitors. The flexibility and applicability of our approach are demonstrated in an additive piecewise exponential model with time-varying effects for right-censored survival times of intensive care patients with sepsis. Geoadditive and additive mixed logit model applications are discussed in an extensive appendix.
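
    A generic spike-and-slab (SSVS-style) Gibbs sampler for single coefficients in a Gaussian linear model conveys the basic selection mechanism; it does not implement the multiplicative parameter expansion or block selection proposed in the paper, and all prior settings below are illustrative.

```r
## Stochastic-search variable selection (SSVS) with a continuous spike-and-slab
## prior on single coefficients of a Gaussian linear model; a textbook Gibbs
## sampler, not the parameter-expanded sampler of the paper.
set.seed(4)
n <- 200; p <- 8
X <- matrix(rnorm(n * p), n, p)
beta_true <- c(2, -1.5, 1, rep(0, p - 3))
y <- drop(X %*% beta_true) + rnorm(n)

v0 <- 0.001; v1 <- 10; w <- 0.5      # spike variance, slab variance, prior incl. prob.
a0 <- b0 <- 0.01                     # inverse-gamma prior for the error variance
iters <- 2000
beta <- rep(0, p); gam_ind <- rep(1L, p); sig2 <- 1
incl <- matrix(NA, iters, p)

for (it in 1:iters) {
  ## 1. beta | gamma, sigma^2  ~  N(A^{-1} X'y / sigma^2, A^{-1})
  v  <- ifelse(gam_ind == 1, v1, v0)
  A  <- crossprod(X) / sig2 + diag(1 / v)
  R  <- chol(A)
  m  <- backsolve(R, forwardsolve(t(R), crossprod(X, y) / sig2))
  beta <- drop(m + backsolve(R, rnorm(p)))
  ## 2. gamma_j | beta_j: Bernoulli with odds slab density vs. spike density
  p1      <- w * dnorm(beta, 0, sqrt(v1))
  p0      <- (1 - w) * dnorm(beta, 0, sqrt(v0))
  gam_ind <- rbinom(p, 1, p1 / (p0 + p1))
  ## 3. sigma^2 | beta  ~  inverse gamma
  res  <- y - drop(X %*% beta)
  sig2 <- 1 / rgamma(1, a0 + n / 2, b0 + sum(res^2) / 2)
  incl[it, ] <- gam_ind
}
colMeans(incl[-(1:500), ])           # posterior inclusion probabilities
```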

    Generalized structured additive regression based on Bayesian P-splines

    Generalized additive models (GAMs) for modelling nonlinear effects of continuous covariates are now well established tools for the applied statistician. In this paper we develop Bayesian GAMs and extensions to generalized structured additive regression based on one- or two-dimensional P-splines as the main building block. The approach extends previous work by Lang and Brezger (2003) for Gaussian responses. Inference relies on Markov chain Monte Carlo (MCMC) simulation techniques, and is either based on iteratively weighted least squares (IWLS) proposals or on latent utility representations of (multi)categorical regression models. Our approach covers the most common univariate response distributions, e.g. the binomial, Poisson or gamma distribution, as well as multicategorical responses. For the first time, we present Bayesian semiparametric inference for the widely used multinomial logit models. As we will demonstrate through two applications on the forest health status of trees and a space-time analysis of health insurance data, the approach allows realistic modelling of complex problems. We consider the enormous flexibility and extensibility of our approach a main advantage of Bayesian inference based on MCMC techniques compared to more traditional approaches. Software for the methodology presented in the paper is provided within the public domain package BayesX.
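
    The core P-spline ingredient, a B-spline basis with a second-order random-walk (difference) penalty combined with IWLS for a non-Gaussian response, can be sketched as a posterior-mode (penalized likelihood) fit; the full MCMC machinery with IWLS proposals is not reproduced, and the simulated Poisson data and smoothing parameter are assumptions.

```r
## Penalized IWLS for a Poisson P-spline: B-spline basis + second-order
## random-walk penalty.  This is the posterior-mode analogue of the MCMC
## approach sketched in the abstract (which uses IWLS proposals within MCMC).
library(splines)

set.seed(5)
n <- 300
x <- sort(runif(n))
y <- rpois(n, exp(sin(3 * pi * x)))

k <- 20
B <- bs(x, df = k, intercept = TRUE)                 # B-spline design matrix
K <- crossprod(diff(diag(k), differences = 2))       # RW2 / difference penalty
lambda <- 10                                         # smoothing parameter (fixed here)

beta <- rep(0, k)
for (it in 1:50) {                                   # penalized IWLS iterations
  eta <- drop(B %*% beta)
  mu  <- exp(eta)
  wgt <- mu                                          # Poisson working weights
  z   <- eta + (y - mu) / mu                         # working observations
  beta_new <- drop(solve(crossprod(B, wgt * B) + lambda * K, crossprod(B, wgt * z)))
  if (max(abs(beta_new - beta)) < 1e-8) { beta <- beta_new; break }
  beta <- beta_new
}
fhat <- drop(B %*% beta)                             # estimated smooth function
```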

    A mixed model approach for structured hazard regression

    The classical Cox proportional hazards model is a benchmark approach for analyzing continuous survival times in the presence of covariate information. In a number of applications, there is a need to relax one or more of its inherent assumptions, such as linearity of the predictor or the proportional hazards property. Also, one is often interested in jointly estimating the baseline hazard together with covariate effects, or one may wish to add a spatial component for spatially correlated survival data. We propose an extended Cox model, where the (log-)baseline hazard is weakly parameterized using penalized splines and the usual linear predictor is replaced by a structured additive predictor incorporating nonlinear effects of continuous covariates and further time scales, spatial effects, frailty components, and more complex interactions. Inclusion of time-varying coefficients leads to models that relax the proportional hazards assumption. Nonlinear and time-varying effects are modelled through penalized splines, and spatial components are treated as correlated random effects following either a Markov random field or a stationary Gaussian random field. All model components, including smoothing parameters, are specified within a unified framework and are estimated simultaneously based on mixed model methodology. The estimation procedure for such general mixed hazard regression models is derived using penalized likelihood for the regression coefficients and (approximate) marginal likelihood for the smoothing parameters. Performance of the proposed method is studied through simulation and an application to leukemia survival data in Northwest England.
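
    A much-reduced illustration is a Cox model whose single covariate enters through a penalized spline, as provided by the survival package; the structured additive predictor (spatial effects, frailties, time-varying effects) and the mixed-model estimation of smoothing parameters described above go beyond this sketch, and the simulated data are purely illustrative.

```r
## Cox model with a penalized-spline covariate effect via the survival package;
## simulated data, default spline df, no spatial or time-varying terms.
library(survival)

set.seed(6)
n      <- 400
age    <- runif(n, 20, 80)
haz    <- 0.02 * exp(0.5 * sin(age / 10))   # nonlinear effect on the log hazard
t_ev   <- rexp(n, rate = haz)               # event times
t_cen  <- rexp(n, rate = 0.02)              # independent censoring times
time   <- pmin(t_ev, t_cen)
status <- as.numeric(t_ev <= t_cen)

fit <- coxph(Surv(time, status) ~ pspline(age, df = 4))
summary(fit)
```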

    Regularization and Model Selection with Categorial Predictors and Effect Modifiers in Generalized Linear Models

    Varying-coefficient models with categorical effect modifiers are considered within the framework of generalized linear models. We distinguish between nominal and ordinal effect modifiers, and propose adequate Lasso-type regularization techniques that allow for (1) selection of relevant covariates, and (2) identification of coefficient functions that actually vary with the level of a potentially effect-modifying factor. We investigate large sample properties, and show in simulation studies that the proposed approaches also perform very well for finite samples. In addition, the presented methods are compared with alternative procedures and applied to real-world medical data.
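
    The generic selection idea can be approximated with an ordinary lasso on covariate-by-factor interaction terms via glmnet; the tailored penalties for nominal and ordinal effect modifiers proposed in the paper are not reproduced, and the simulated data and design coding are assumptions.

```r
## Ordinary lasso on level-specific slopes as a crude stand-in for the
## nominal/ordinal penalties of the paper; data and coding are illustrative.
library(glmnet)

set.seed(7)
n <- 500
x <- rnorm(n)
g <- factor(sample(c("A", "B", "C"), n, replace = TRUE))   # categorical effect modifier
slope <- c(A = 1, B = 1, C = -0.5)                         # coefficient varies only for C
y <- rbinom(n, 1, plogis(slope[as.character(g)] * x))

## design: main effects of g plus a separate slope of x for every level of g
X <- model.matrix(~ g + x:g, data = data.frame(g = g, x = x))[, -1]
cvfit <- cv.glmnet(X, y, family = "binomial")
coef(cvfit, s = "lambda.1se")        # shrunken level-specific slopes and factor effects
```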

    Penalized additive regression for space-time data: a Bayesian perspective

    We propose extensions of penalized spline generalized additive models for analysing space-time regression data and study them from a Bayesian perspective. Non-linear effects of continuous covariates and time trends are modelled through Bayesian versions of penalized splines, while correlated spatial effects follow a Markov random field prior. This allows all functions and effects to be treated within a unified general framework by assigning appropriate priors with different forms and degrees of smoothness. Inference can be performed either with full Bayes (FB) or empirical Bayes (EB) posterior analysis. FB inference using MCMC techniques is a slight extension of our own previous work. For EB inference, a computationally efficient solution is developed on the basis of a generalized linear mixed model representation. The second approach can be viewed as posterior mode estimation and is closely related to penalized likelihood estimation in a frequentist setting. Variance components, corresponding to smoothing parameters, are then estimated using marginal likelihood. We carefully compare both inferential procedures in simulation studies and illustrate them through real data applications. The methodology is available in the open domain statistical package BayesX and as an S-plus/R function.
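
    The empirical Bayes route corresponds to estimating variance (smoothing) parameters by restricted marginal likelihood in a mixed-model representation; a hedged stand-in uses mgcv with REML-estimated smoothing parameters and a two-dimensional spatial smooth in place of a Markov random field, on simulated space-time data.

```r
## Empirical-Bayes-style fit: smoothing parameters estimated by REML in mgcv,
## with a 2-d spatial smooth standing in for the Markov random field prior.
library(mgcv)

set.seed(8)
n   <- 1000
tim <- runif(n)                        # calendar time
x   <- runif(n)                        # continuous covariate
lon <- runif(n); lat <- runif(n)       # spatial coordinates
eta <- 0.5 * tim + sin(2 * pi * x) + 0.8 * sin(pi * lon) * cos(pi * lat)
y   <- rpois(n, exp(eta))

fit <- gam(y ~ s(tim, bs = "ps") +     # time trend as a penalized spline
             s(x, bs = "ps") +         # nonlinear covariate effect
             s(lon, lat),              # spatial surface (MRF stand-in)
           family = poisson, method = "REML")
gam.vcomp(fit)                         # variance components <-> smoothing parameters
```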

    Variable Selection and Model Choice in Geoadditive Regression Models

    Model choice and variable selection are issues of major concern in practical regression analyses. We propose a boosting procedure that facilitates both tasks in a class of complex geoadditive regression models comprising spatial effects, nonparametric effects of continuous covariates, interaction surfaces, random effects, and varying coefficient terms. The major modelling components are penalized splines and their bivariate tensor product extensions. All smooth model terms are represented as the sum of a parametric component and a remaining smooth component with one degree of freedom to obtain a fair comparison between all model terms. A generic representation of the geoadditive model allows us to devise a general boosting algorithm that implements automatic model choice and variable selection. We demonstrate the versatility of our approach with two examples: a geoadditive Poisson regression model for species counts in habitat suitability analyses and a geoadditive logit model for the analysis of forest health.
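
    Componentwise boosting can be sketched in base R with univariate smoothing-spline base-learners on simulated Gaussian data; the geoadditive version described above adds spatial, random-effect and tensor-product base-learners and equalizes their degrees of freedom, none of which is attempted here.

```r
## Componentwise L2 boosting with smoothing-spline base-learners: in every
## iteration each candidate covariate is fitted to the current residuals and
## only the best-fitting one is updated, which performs variable selection.
set.seed(9)
n <- 400; p <- 6
X <- matrix(runif(n * p), n, p)
y <- sin(2 * pi * X[, 1]) + 2 * (X[, 2] - 0.5) + rnorm(n, sd = 0.3)  # x3..x6 irrelevant

nu    <- 0.1                         # step length
mstop <- 150                         # number of boosting iterations
fhat  <- rep(mean(y), n)             # current additive fit, started at the offset
selected <- integer(mstop)

for (m in 1:mstop) {
  r    <- y - fhat                   # negative gradient = residuals for L2 loss
  rss  <- numeric(p)
  fits <- vector("list", p)
  for (j in 1:p) {                   # fit every candidate base-learner to r
    fits[[j]] <- predict(smooth.spline(X[, j], r, df = 4), X[, j])$y
    rss[j]    <- sum((r - fits[[j]])^2)
  }
  jstar <- which.min(rss)            # pick the best-fitting component ...
  fhat  <- fhat + nu * fits[[jstar]] # ... and take a small step along it
  selected[m] <- jstar
}
table(selected)                      # selection frequencies: x1 and x2 should dominate
```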

    Geoadditive hazard regression for interval censored survival times

    The Cox proportional hazards model is the most commonly used method for analyzing the impact of covariates on continuous survival times. In its classical form, the Cox model was introduced in the setting of right-censored observations. However, in practice other sampling schemes are frequently encountered and therefore extensions allowing for interval and left censoring or left truncation are clearly desirable. Furthermore, many applications require a more flexible modeling of covariate information than the usual linear predictor. For example, effects of continuous covariates are likely to be of nonlinear form, or spatial information is to be included appropriately. Further extensions should allow for time-varying effects of covariates or covariates that are themselves time-varying. Such models relax the assumption of proportional hazards. We propose a regression model for the hazard rate that combines and extends the above-mentioned features on the basis of a unifying Bayesian model formulation. Nonlinear and time-varying effects as well as the baseline hazard rate are modelled by penalized splines. Spatial effects can be included based on either Markov random fields or stationary Gaussian random fields. The model allows for arbitrary combinations of left, right and interval censoring as well as left truncation. Estimation is based on a reparameterisation of the model as a variance components mixed model. The variance parameters corresponding to inverse smoothing parameters can then be estimated based on an approximate marginal likelihood approach. As an application we present an analysis of childhood mortality in Nigeria, where the interval censoring framework also allows us to deal with the problem of heaped survival times caused by memory effects. In a simulation study we investigate the effect of ignoring the interval-censored nature of the observations.
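
    The simplest way to see interval censoring in action is a parametric Weibull model fitted with survival::survreg and Surv(left, right, type = "interval2"); the geoadditive hazard model of the paper (penalized splines, spatial effects, left truncation) is far more general, and the simulated inspection-interval data below are illustrative assumptions.

```r
## Weibull model for interval-censored data with survival::survreg; an NA
## upper bound marks right censoring, an NA lower bound left censoring.
library(survival)

set.seed(10)
n <- 300
x <- rnorm(n)
t_true <- rweibull(n, shape = 1.5, scale = exp(1 + 0.5 * x))

## survival times are only known up to the inspection interval containing them
grid  <- seq(0, 10, by = 2)            # inspection times 0, 2, ..., 10
left  <- grid[findInterval(t_true, grid)]
right <- left + 2
right[right > 10] <- NA                # no inspection after 10: right-censored at 10
left[left == 0]   <- NA                # event before first inspection: left-censored at 2

fit <- survreg(Surv(left, right, type = "interval2") ~ x, dist = "weibull")
summary(fit)
```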