373 research outputs found

    Bayesian Regularisation in Structured Additive Regression Models for Survival Data

    In recent years, penalized likelihood approaches have attracted a lot of interest both in semiparametric regression and for the regularization of high-dimensional regression models. In this paper, we introduce a Bayesian formulation that combines both aspects in a joint regression model, with a focus on hazard regression for survival times. While Bayesian penalized splines form the basis for estimating nonparametric and flexible time-varying effects, regularization of high-dimensional covariate vectors is based on scale mixture of normals priors. This class of priors keeps a (conditionally) Gaussian prior for the regression coefficients at the predictor stage of the model but introduces suitable mixture distributions for the Gaussian variance to achieve regularization. The scale mixture property makes it possible to devise general and adaptive Markov chain Monte Carlo simulation algorithms for fitting a variety of hazard regression models. In particular, unifying algorithms based on iteratively weighted least squares proposals can be employed both for regularization and for penalized semiparametric function estimation. Since sampling-based estimates no longer have the variable selection property well known for the Lasso in frequentist analyses, we additionally consider spike and slab priors, which introduce a further mixing stage that separates influential from redundant parameters. We demonstrate the different shrinkage properties in three simulation settings and apply the methods to the PBC Liver dataset.
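The scale mixture idea in this abstract can be illustrated with the textbook case of the Bayesian lasso. This is a minimal sketch, not the paper's algorithm. It assumes the standard representation in which a coefficient is conditionally Gaussian given its variance `tau2`, and `tau2` has an exponential mixing distribution with rate `lam**2 / 2`, so that the marginal prior is Laplace with rate `lam`. The abstract does not state the paper's own parameterisation, and the function name and the value of `lam` are hypothetical.

```python
import numpy as np

def sample_laplace_via_scale_mixture(lam, size, rng):
    """Draw from a Laplace(rate=lam) prior via its normal scale mixture form."""
    # Mixing stage: tau^2 ~ Exponential(rate = lam^2 / 2), i.e. mean 2 / lam^2.
    tau2 = rng.exponential(scale=2.0 / lam**2, size=size)
    # Conditional stage: beta | tau^2 ~ N(0, tau^2).
    return rng.normal(loc=0.0, scale=np.sqrt(tau2))

rng = np.random.default_rng(0)
lam = 2.0  # hypothetical shrinkage parameter
mixture_draws = sample_laplace_via_scale_mixture(lam, 200_000, rng)
direct_draws = rng.laplace(loc=0.0, scale=1.0 / lam, size=200_000)

# Both sets of draws should agree with the Laplace moments:
# variance 2 / lam^2 and mean absolute value 1 / lam.
print(np.var(mixture_draws), np.var(direct_draws), 2.0 / lam**2)
print(np.mean(np.abs(mixture_draws)), np.mean(np.abs(direct_draws)), 1.0 / lam)
```

Keeping the conditional prior Gaussian is what allows a Gibbs-type sampler to reuse Gaussian updates for the coefficients and update the variances in a separate step.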

    DPpackage: Bayesian Semi- and Nonparametric Modeling in R

    Data analysis sometimes requires the relaxation of parametric assumptions in order to gain modeling flexibility and robustness against mis-specification of the probability model. In the Bayesian context, this is accomplished by placing a prior distribution on a function space, such as the space of all probability distributions or the space of all regression functions. Unfortunately, posterior distributions ranging over function spaces are highly complex and hence sampling methods play a key role. This paper provides an introduction to a simple, yet comprehensive, set of programs for the implementation of some Bayesian nonparametric and semiparametric models in R, DPpackage. Currently, DPpackage includes models for marginal and conditional density estimation, receiver operating characteristic curve analysis, interval-censored data, binary regression data, item response data, longitudinal and clustered data using generalized linear mixed models, and regression data using generalized additive models. The package also contains functions to compute pseudo-Bayes factors for model comparison and for eliciting the precision parameter of the Dirichlet process prior, and a general purpose Metropolis sampling algorithm. To maximize computational efficiency, the actual sampling for each model is carried out using compiled C, C++ or Fortran code.
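One common way to reason about the precision parameter of a Dirichlet process prior is through the prior expected number of distinct clusters among n observations, which equals the sum of alpha / (alpha + i) for i = 0, ..., n - 1. The sketch below is a generic illustration of that identity in Python. It is not DPpackage's elicitation function, and both function names are hypothetical.

```python
def expected_clusters(alpha, n):
    """Prior expected number of distinct clusters under DP(alpha) for n observations."""
    return sum(alpha / (alpha + i) for i in range(n))

def alpha_for_expected_clusters(target, n, lo=1e-8, hi=1e6, iters=200):
    """Bisection for alpha such that expected_clusters(alpha, n) is close to target.

    Assumes 1 < target < n; the expectation is increasing in alpha.
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if expected_clusters(mid, n) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Example: which precision gives about 5 expected clusters for 100 observations?
alpha_hat = alpha_for_expected_clusters(target=5.0, n=100)
print(alpha_hat, expected_clusters(alpha_hat, 100))
```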

    Joint Dispersion Model with a Flexible Link

    The objective is to jointly model longitudinal and survival data in a real HIV/AIDS dataset, taking into account the dependence between the two responses, using a shared parameter approach within a Bayesian framework. We propose a linear mixed effects dispersion model to fit the CD4 longitudinal biomarker data with between-individual heterogeneity in the mean and variance. In doing so, we relax the usual assumption of a common variance for the longitudinal residuals. A hazard regression model is additionally considered to model the time from HIV/AIDS diagnosis until failure, with time-varying coefficients accounting for the link between the longitudinal and survival processes. This flexibility is specified using penalized splines and allows the relationship to vary over time. Because heteroscedasticity may be related to survival, the standard deviation is included as a covariate in the hazard model, making it possible to study the effect of the stability of the CD4 counts on survival. The proposed framework outperforms the most commonly used joint models, highlighting the importance of correctly accounting for individual heterogeneity in the measurement error variance and for the evolution of the disease over time, and bringing new insights into this biomarker-survival relation. Comment: 27 pages, 3 figures, 2 tables