Bayesian Regularisation in Structured Additive Regression Models for Survival Data
In recent years, penalized likelihood approaches have attracted considerable interest both in semiparametric regression and in the regularization of high-dimensional regression models. In this paper, we introduce a Bayesian formulation that combines both aspects in a joint regression model, with a focus on hazard regression for survival times. While Bayesian penalized splines form the basis for estimating nonparametric and flexible time-varying effects, regularization of high-dimensional covariate vectors is based on scale mixture of normals priors. This class of priors retains a (conditionally) Gaussian prior for the regression coefficients at the predictor stage of the model but introduces suitable mixture distributions for the Gaussian variance to achieve regularization. This scale mixture property makes it possible to devise general and adaptive Markov chain Monte Carlo simulation algorithms for fitting a variety of hazard regression models. In particular, unifying algorithms based on iteratively weighted least squares proposals can be employed both for regularization and for penalized semiparametric function estimation. Since sampling-based estimates no longer have the variable selection property well known for the Lasso in frequentist analyses, we additionally consider spike and slab priors, which introduce a further mixing stage that separates influential from redundant parameters. We demonstrate the different shrinkage properties in three simulation settings and apply the methods to the PBC Liver dataset.
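The scale mixture idea in this abstract can be illustrated with the best-known special case. The sketch below is not the paper's sampler; it assumes the standard Lasso-type construction in which the variance tau2 is exponentially distributed with rate lambda^2 / 2, so that the coefficient is marginally Laplace with E|beta| = 1 / lambda and Var(beta) = 2 / lambda^2. The function name and the value lambda = 2 are hypothetical choices for illustration.

```python
import math
import random


def sample_laplace_via_scale_mixture(lam, n, seed=0):
    """Draw n values of beta from a two-stage (scale mixture) prior.

    Stage 1: tau2 ~ Exponential(rate = lam**2 / 2)   (mixing distribution)
    Stage 2: beta | tau2 ~ Normal(0, tau2)           (conditionally Gaussian)
    Marginally, beta then follows a Laplace distribution with rate lam.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        tau2 = rng.expovariate(lam ** 2 / 2.0)  # argument is the rate
        draws.append(rng.gauss(0.0, math.sqrt(tau2)))
    return draws


lam = 2.0  # illustrative value, not taken from the paper
draws = sample_laplace_via_scale_mixture(lam, 100_000, seed=1)
mean_abs = sum(abs(d) for d in draws) / len(draws)
second_moment = sum(d * d for d in draws) / len(draws)
print("E|beta| (theory 1/lam):", round(mean_abs, 3))
print("Var(beta) (theory 2/lam^2):", round(second_moment, 3))
```

Because beta stays Gaussian given tau2, any update that works for a Gaussian prior can be reused inside a sampler, with the variance updated in a separate step; this is the practical appeal that the abstract describes.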
A mixed model approach for structured hazard regression
The classical Cox proportional hazards model is a benchmark approach for analyzing continuous survival times in the presence of covariate information. In a number of applications, there is a need to relax one or more of its inherent assumptions, such as linearity of the predictor or the proportional hazards property. Also, one is often interested in jointly estimating the baseline hazard together with covariate effects or one may wish to add a spatial component for spatially correlated survival data. We propose an extended Cox model, where the (log-)baseline hazard is weakly parameterized using penalized splines and the usual linear predictor is replaced by a structured additive predictor incorporating nonlinear effects of continuous covariates and further time scales, spatial effects, frailty components, and more complex interactions. Inclusion of time-varying coefficients leads to models that relax the proportional hazards assumption. Nonlinear and time-varying effects are modelled through penalized splines, and spatial components are treated as correlated random effects following either a Markov random field or a stationary Gaussian random field. All model components, including smoothing parameters, are specified within a unified framework and are estimated simultaneously based on mixed model methodology. The estimation procedure for such general mixed hazard regression models is derived using penalized likelihood for regression coefficients and (approximate) marginal likelihood for smoothing parameters. Performance of the proposed method is studied through simulation and an application to leukemia survival data in Northwest England.
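As a rough illustration of the penalty behind penalized splines, the sketch below computes a difference penalty on a vector of spline coefficients. It assumes the common P-spline form, in which the penalty is the sum of squared second-order differences of adjacent coefficients; the abstract does not state which penalty the authors use, and the function names are hypothetical.

```python
def difference_matrix(n_coef, order=2):
    """Return the difference matrix D of the given order as a list of rows."""
    # Start from the identity and difference adjacent rows `order` times.
    D = [[1.0 if i == j else 0.0 for j in range(n_coef)] for i in range(n_coef)]
    for _ in range(order):
        D = [
            [D[i + 1][j] - D[i][j] for j in range(n_coef)]
            for i in range(len(D) - 1)
        ]
    return D


def roughness_penalty(beta, order=2):
    """Compute beta' D'D beta, the squared norm of the differenced coefficients."""
    D = difference_matrix(len(beta), order)
    diffs = [sum(d * b for d, b in zip(row, beta)) for row in D]
    return sum(v * v for v in diffs)


# A linear coefficient sequence is not penalized by second differences ...
print(roughness_penalty([1.0, 2.0, 3.0, 4.0, 5.0]))
# ... whereas a single spike is.
print(roughness_penalty([0.0, 0.0, 1.0, 0.0, 0.0]))
```

In a mixed model formulation, the penalized part of the coefficient vector is treated as a random effect, and the smoothing parameter corresponds to a ratio of variance components, which is why it can be estimated together with the other model components.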
Functional Regression
Functional data analysis (FDA) involves the analysis of data whose ideal
units of observation are functions defined on some continuous domain, and the
observed data consist of a sample of functions taken from some population,
sampled on a discrete grid. Ramsay and Silverman's 1997 textbook sparked the
development of this field, which has accelerated in the past 10 years to become
one of the fastest growing areas of statistics, fueled by the growing number of
applications yielding this type of data. One unique characteristic of FDA is
the need to combine information both across and within functions, which Ramsay
and Silverman called replication and regularization, respectively. This article
will focus on functional regression, the area of FDA that has received the most
attention in applications and methodological development. First will be an
introduction to basis functions, key building blocks for regularization in
functional regression methods, followed by an overview of functional regression
methods, split into three types: [1] functional predictor regression
(scalar-on-function), [2] functional response regression (function-on-scalar)
and [3] function-on-function regression. For each, the role of replication and
regularization will be discussed and the methodological development described
in a roughly chronological manner, at times deviating from the historical
timeline to group together similar methods. The primary focus is on modeling
and methodology, highlighting the modeling structures that have been developed
and the various regularization approaches employed. At the end is a brief
discussion describing potential areas of future development in this field.
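To make the scalar-on-function case concrete: a functional predictor x(t) usually enters the model through an integral of x(t) times a coefficient function beta(t), and since the function is observed only on a discrete grid, the integral has to be approximated. The sketch below is a minimal illustration under stated assumptions (a trapezoidal rule, a known coefficient function, hypothetical function names), not a method taken from the article.

```python
def trapezoid_inner_product(x_vals, beta_vals, grid):
    """Approximate the integral of x(t) * beta(t) over the grid (trapezoidal rule)."""
    total = 0.0
    for k in range(len(grid) - 1):
        h = grid[k + 1] - grid[k]
        left = x_vals[k] * beta_vals[k]
        right = x_vals[k + 1] * beta_vals[k + 1]
        total += 0.5 * h * (left + right)
    return total


def scalar_on_function_predictor(alpha, x_vals, beta_vals, grid):
    """Linear predictor: alpha + integral of x(t) * beta(t) dt."""
    return alpha + trapezoid_inner_product(x_vals, beta_vals, grid)


grid = [i / 100 for i in range(101)]  # 101 equally spaced points on [0, 1]
x_vals = list(grid)                   # observed curve x(t) = t
beta_vals = list(grid)                # assumed coefficient function beta(t) = t
# The exact integral of t^2 over [0, 1] is 1/3.
print(scalar_on_function_predictor(0.0, x_vals, beta_vals, grid))
```

In practice beta(t) is unknown and is typically expanded in basis functions, so the integral reduces to a finite set of regression coefficients that can then be regularized.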
Penalized Likelihood and Bayesian Function Selection in Regression Models
Challenging research in various fields has driven a wide range of
methodological advances in variable selection for regression models with
high-dimensional predictors. In comparison, selection of nonlinear functions in
models with additive predictors has been considered only more recently. Several
competing suggestions have been developed at about the same time and often do
not refer to each other. This article provides a state-of-the-art review on
function selection, focusing on penalized likelihood and Bayesian concepts,
relating various approaches to each other in a unified framework. In an
empirical comparison, also including boosting, we evaluate several methods
through applications to simulated and real data, thereby providing some
guidance on their performance in practice.
Normal-Mixture-of-Inverse-Gamma Priors for Bayesian Regularization and Model Selection in Structured Additive Regression Models
In regression models with many potential predictors, choosing an appropriate subset of covariates and their interactions at the same time as determining whether linear or more flexible functional forms are required is a challenging and important task. We propose a spike-and-slab prior structure in order to include or exclude single coefficients as well as blocks of coefficients associated with factor variables, random effects or basis expansions of smooth functions. Structured additive models with this prior structure are estimated with Markov Chain Monte Carlo using a redundant multiplicative parameter expansion. We discuss shrinkage properties of the novel prior induced by the redundant parameterization, investigate its sensitivity to hyperparameter settings and compare performance of the proposed method in terms of model selection, sparsity recovery, and estimation error for Gaussian, binomial and Poisson responses on real and simulated data sets with that of component-wise boosting and other approaches.
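The normal-mixture-of-inverse-gamma (NMIG) prior named in the title can be sketched as a three-stage draw for a single coefficient: an indicator gamma taking the value 1 (slab) with probability w or a small value v0 (spike) otherwise, a variance tau2 from an inverse gamma distribution, and finally beta from a normal distribution with variance gamma * tau2. The code below is a hedged sketch of that generic construction only; it omits the parameter expansion described in the abstract, and the hyperparameter values and function name are illustrative assumptions.

```python
import math
import random


def sample_nmig_prior(n, w=0.5, v0=0.00025, a=5.0, b=25.0, seed=0):
    """Draw n (in_slab, beta) pairs from a generic NMIG prior.

    gamma = 1 with probability w (slab), else v0 (spike)
    tau2  ~ InverseGamma(shape=a, scale=b)
    beta | gamma, tau2 ~ Normal(0, gamma * tau2)
    All hyperparameter defaults here are illustrative assumptions.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        in_slab = rng.random() < w
        gamma = 1.0 if in_slab else v0
        # If X ~ Gamma(shape=a, scale=1/b), then 1/X ~ InverseGamma(a, b).
        tau2 = 1.0 / rng.gammavariate(a, 1.0 / b)
        beta = rng.gauss(0.0, math.sqrt(gamma * tau2))
        draws.append((in_slab, beta))
    return draws


draws = sample_nmig_prior(20_000, seed=1)
slab = [abs(beta) for in_slab, beta in draws if in_slab]
spike = [abs(beta) for in_slab, beta in draws if not in_slab]
print("share of slab draws:", round(len(slab) / len(draws), 3))
print("mean |beta| in slab:", round(sum(slab) / len(slab), 3))
print("mean |beta| in spike:", round(sum(spike) / len(spike), 3))
```

Spike draws are concentrated tightly around zero while slab draws are diffuse, so the posterior probability of gamma = 1 can serve as an inclusion probability for the corresponding coefficient or block.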
Spike-and-Slab Priors for Function Selection in Structured Additive Regression Models
Structured additive regression provides a general framework for complex
Gaussian and non-Gaussian regression models, with predictors comprising
arbitrary combinations of nonlinear functions and surfaces, spatial effects,
varying coefficients, random effects and further regression terms. The large
flexibility of structured additive regression makes function selection a
challenging and important task, aiming at (1) selecting the relevant
covariates, (2) choosing an appropriate and parsimonious representation of the
impact of covariates on the predictor and (3) determining the required
interactions. We propose a spike-and-slab prior structure for function
selection that makes it possible to include or exclude single coefficients as
well as blocks of coefficients representing specific model terms. A novel
multiplicative parameter expansion is required to obtain good mixing and
convergence properties in a Markov chain Monte Carlo simulation approach and is
shown to induce desirable shrinkage properties. In simulation studies and with
(real) benchmark classification data, we investigate sensitivity to
hyperparameter settings and compare performance to competitors. The flexibility
and applicability of our approach are demonstrated in an additive piecewise
exponential model with time-varying effects for right-censored survival times
of intensive care patients with sepsis. Geoadditive and additive mixed logit
model applications are discussed in an extensive appendix.
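The piecewise exponential model mentioned in this abstract is commonly fitted by splitting each right-censored survival time into one pseudo-observation per interval of a time grid, after which the likelihood has the form of a Poisson likelihood with the log exposure time as offset. The sketch below shows only that data expansion step, under the assumption that the last cut point is at least as large as every observed time; the function name and the cut points are hypothetical.

```python
def expand_to_intervals(time, event, cuts):
    """Split one survival time into (interval index, exposure, event) rows.

    time  : observed follow-up time
    event : 1 if the event was observed, 0 if right-censored
    cuts  : increasing cut points starting at 0; the last cut is assumed
            to be at least as large as `time`
    """
    rows = []
    for j in range(len(cuts) - 1):
        lo, hi = cuts[j], cuts[j + 1]
        if time <= lo:
            break  # subject is no longer at risk in this interval
        exposure = min(time, hi) - lo
        # The event is recorded only in the interval where follow-up ends.
        event_here = 1 if (event == 1 and time <= hi) else 0
        rows.append((j, exposure, event_here))
    return rows


cuts = [0.0, 1.0, 2.0, 3.0]
print(expand_to_intervals(2.5, 1, cuts))  # event in the third interval
print(expand_to_intervals(1.5, 0, cuts))  # censored in the second interval
```

Each row can then carry its own covariate values and interval-specific baseline parameter, which is how time-varying effects enter such a model.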
A Bayesian Approach to Sparse plus Low rank Network Identification
We consider the problem of modeling multivariate time series with
parsimonious dynamical models which can be represented as sparse dynamic
Bayesian networks with few latent nodes. This structure translates into a
sparse plus low rank model. In this paper, we propose a Gaussian regression
approach to identify such a model.