Penalized variable selection procedure for Cox models with semiparametric relative risk
We study the Cox models with semiparametric relative risk, which can be
partially linear with one nonparametric component, or multiple additive or
nonadditive nonparametric components. A penalized partial likelihood procedure
is proposed to simultaneously estimate the parameters and select variables for
both the parametric and the nonparametric parts. Two penalties are applied
sequentially. The first penalty, governing the smoothness of the multivariate
nonlinear covariate effect function, provides a smoothing spline ANOVA
framework that is exploited to derive an empirical model selection tool for the
nonparametric part. The second penalty, either the
smoothly-clipped-absolute-deviation (SCAD) penalty or the adaptive LASSO
penalty, achieves variable selection in the parametric part. We show that the
resulting estimator of the parametric part possesses the oracle property, and
that the estimator of the nonparametric part achieves the optimal rate of
convergence. The proposed procedures are shown to work well in simulation
experiments and are then applied to a real data example on sexually transmitted
diseases.
Comment: Published at http://dx.doi.org/10.1214/09-AOS780 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
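The two penalties named for the parametric part can be written down compactly. The following is a minimal sketch of the standard SCAD and adaptive LASSO penalty functions only, not of the paper's estimation procedure. The function names, the default a = 3.7, and the use of nonzero initial estimates `beta_init` as weights are illustrative assumptions.

```python
def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty for a single coefficient (standard three-piece form)."""
    if lam <= 0 or a <= 2:
        raise ValueError("requires lam > 0 and a > 2")
    t = abs(theta)
    if t <= lam:
        # Behaves like the LASSO penalty near zero.
        return lam * t
    if t <= a * lam:
        # Quadratic transition between the linear and constant pieces.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Constant for large coefficients, so they are not shrunk further.
    return lam * lam * (a + 1) / 2


def adaptive_lasso_penalty(beta, beta_init, lam, gamma=1.0):
    """Adaptive LASSO penalty: lam * sum_j |beta_j| / |beta_init_j|**gamma.

    Assumes every entry of beta_init (a hypothetical initial estimate)
    is nonzero.
    """
    return lam * sum(abs(b) / abs(b0) ** gamma for b, b0 in zip(beta, beta_init))
```

The SCAD penalty is continuous at both breakpoints and flat beyond a * lam, which is the feature usually associated with reduced bias for large coefficients.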
Focused information criterion and model averaging for generalized additive partial linear models
We study model selection and model averaging in generalized additive partial
linear models (GAPLMs). Polynomial splines are used to approximate the nonparametric
functions. The corresponding estimators of the linear parameters are shown to
be asymptotically normal. We then develop a focused information criterion (FIC)
and a frequentist model average (FMA) estimator on the basis of the
quasi-likelihood principle and examine theoretical properties of the FIC and
FMA. The major advantages of the proposed procedures over the existing ones are
their computational expediency and theoretical reliability. Simulation
experiments have provided evidence of the superiority of the proposed
procedures. The approach is further applied to a real-world data example.
Comment: Published at http://dx.doi.org/10.1214/10-AOS832 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
Semi-parametric regression: Efficiency gains from modeling the nonparametric part
It is widely acknowledged that structured nonparametric modeling that circumvents
the curse of dimensionality is important in nonparametric estimation. In this
paper we show that the same holds for semi-parametric estimation. We argue that
estimation of the parametric component of a semi-parametric model can be
substantially improved when more structure is put into the nonparametric part of
the model. We illustrate this for the partially linear model, and investigate
efficiency gains when the nonparametric part of the model has an additive
structure. We present the semi-parametric Fisher information bound for
estimating the parametric part of the partially linear additive model and
provide semi-parametric efficient estimators for which we use a smooth
backfitting technique to deal with the additive nonparametric part. We also
present the finite sample performances of the proposed estimators and analyze
Boston housing data as an illustration.
Comment: Published at http://dx.doi.org/10.3150/10-BEJ296 in Bernoulli
(http://isi.cbs.nl/bernoulli/) by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm
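For orientation, the two models compared in this abstract are commonly written as below. The notation (response Y, linear covariates X, nonparametric covariates Z_1, ..., Z_d, error ε) is an assumption for illustration and is not taken from the paper itself.

```latex
% Partially linear model: unstructured nonparametric part
Y = X^\top \beta + m(Z_1, \ldots, Z_d) + \varepsilon,
\qquad E(\varepsilon \mid X, Z) = 0,

% Partially linear additive model: additive structure on the nonparametric part
Y = X^\top \beta + m_1(Z_1) + \cdots + m_d(Z_d) + \varepsilon,
\qquad E(\varepsilon \mid X, Z) = 0.
```

The efficiency gain discussed in the abstract concerns estimation of β when the second, more structured form is assumed.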
A goodness-of-fit test for parametric and semi-parametric models in multiresponse regression
We propose an empirical likelihood test that is able to test the goodness of
fit of a class of parametric and semi-parametric multiresponse regression
models. The class includes as special cases fully parametric models;
semi-parametric models, like the multiindex and the partially linear models;
and models with shape constraints. Another feature of the test is that it
allows both the response variable and the covariate to be multivariate, which
means that multiple regression curves can be tested simultaneously. The test
also allows the presence of infinite-dimensional nuisance functions in the
model to be tested. It is shown that the empirical likelihood test statistic is
asymptotically normally distributed under certain mild conditions and permits a
wild bootstrap calibration. Despite the large size of the class of models to be
considered, the empirical likelihood test enjoys good power properties against
departures from a hypothesized model within the class.
Comment: Published at http://dx.doi.org/10.3150/09-BEJ208 in Bernoulli
(http://isi.cbs.nl/bernoulli/) by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm
Penalized Likelihood and Bayesian Function Selection in Regression Models
Challenging research in various fields has driven a wide range of
methodological advances in variable selection for regression models with
high-dimensional predictors. In comparison, selection of nonlinear functions in
models with additive predictors has been considered only more recently. Several
competing suggestions have been developed at about the same time and often do
not refer to each other. This article provides a state-of-the-art review on
function selection, focusing on penalized likelihood and Bayesian concepts,
relating various approaches to each other in a unified framework. In an
empirical comparison, also including boosting, we evaluate several methods
through applications to simulated and real data, thereby providing some
guidance on their performance in practice.
Propriety of Posteriors in Structured Additive Regression Models: Theory and Empirical Evidence
Structured additive regression comprises many semiparametric regression models, such as generalized additive (mixed) models, geoadditive models, and hazard regression models, within a unified framework. In a Bayesian formulation, nonparametric functions, spatial effects and further model components are specified in terms of multivariate Gaussian priors for high-dimensional vectors of regression coefficients. For several model terms, such as penalised splines or Markov random fields, these Gaussian prior distributions involve rank-deficient precision matrices, yielding partially improper priors. Moreover, hyperpriors for the variances (corresponding to inverse smoothing parameters) may also be specified as improper, e.g. corresponding to Jeffreys' prior or a flat prior for the standard deviation. Hence, propriety of the joint posterior is a crucial issue for full Bayesian inference, in particular if it is based on Markov chain Monte Carlo simulations. We establish theoretical results providing sufficient (and sometimes necessary) conditions for propriety and provide empirical evidence through several accompanying simulation studies.
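To make the phrase "rank-deficient precision matrices" concrete, here is a small sketch using a first-order random walk penalty, a common choice for penalised splines. It is an illustrative assumption, not a construction taken from the paper, and the helper names `rw1_precision` and `quad_form` are hypothetical. The precision matrix K = DᵀD, with D the first-difference matrix, gives zero penalty to constant vectors, so the Gaussian prior proportional to exp(-xᵀKx / (2τ²)) is flat in that direction and therefore improper.

```python
def rw1_precision(n):
    """Precision matrix K = D^T D for the (n-1) x n first-difference matrix D."""
    K = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        # Row i of D has -1 in column i and +1 in column i+1;
        # accumulate its outer product into K.
        K[i][i] += 1.0
        K[i + 1][i + 1] += 1.0
        K[i][i + 1] -= 1.0
        K[i + 1][i] -= 1.0
    return K


def quad_form(K, x):
    """Quadratic form x^T K x, equal to the sum of squared first differences."""
    n = len(x)
    return sum(x[i] * K[i][j] * x[j] for i in range(n) for j in range(n))


K = rw1_precision(5)
constant = [1.0] * 5
linear = [0.0, 1.0, 2.0, 3.0, 4.0]
print(quad_form(K, constant))  # zero penalty: constants lie in the null space of K
print(quad_form(K, linear))    # four unit differences, each contributing 1
```

Because every row of K sums to zero, K is singular, which is the source of the partial impropriety described in the abstract.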