9,519 research outputs found

    Penalized variable selection procedure for Cox models with semiparametric relative risk

    Full text link
    We study the Cox models with semiparametric relative risk, which can be partially linear with one nonparametric component, or multiple additive or nonadditive nonparametric components. A penalized partial likelihood procedure is proposed to simultaneously estimate the parameters and select variables for both the parametric and the nonparametric parts. Two penalties are applied sequentially. The first penalty, governing the smoothness of the multivariate nonlinear covariate effect function, provides a smoothing spline ANOVA framework that is exploited to derive an empirical model selection tool for the nonparametric part. The second penalty, either the smoothly-clipped-absolute-deviation (SCAD) penalty or the adaptive LASSO penalty, achieves variable selection in the parametric part. We show that the resulting estimator of the parametric part possesses the oracle property, and that the estimator of the nonparametric part achieves the optimal rate of convergence. The proposed procedures are shown to work well in simulation experiments, and then applied to a real data example on sexually transmitted diseases.Comment: Published in at http://dx.doi.org/10.1214/09-AOS780 the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Focused information criterion and model averaging for generalized additive partial linear models

    Full text link
    We study model selection and model averaging in generalized additive partial linear models (GAPLMs). Polynomial spline is used to approximate nonparametric functions. The corresponding estimators of the linear parameters are shown to be asymptotically normal. We then develop a focused information criterion (FIC) and a frequentist model average (FMA) estimator on the basis of the quasi-likelihood principle and examine theoretical properties of the FIC and FMA. The major advantages of the proposed procedures over the existing ones are their computational expediency and theoretical reliability. Simulation experiments have provided evidence of the superiority of the proposed procedures. The approach is further applied to a real-world data example.Comment: Published in at http://dx.doi.org/10.1214/10-AOS832 the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Semi-parametric regression: Efficiency gains from modeling the nonparametric part

    Full text link
    It is widely admitted that structured nonparametric modeling that circumvents the curse of dimensionality is important in nonparametric estimation. In this paper we show that the same holds for semi-parametric estimation. We argue that estimation of the parametric component of a semi-parametric model can be improved essentially when more structure is put into the nonparametric part of the model. We illustrate this for the partially linear model, and investigate efficiency gains when the nonparametric part of the model has an additive structure. We present the semi-parametric Fisher information bound for estimating the parametric part of the partially linear additive model and provide semi-parametric efficient estimators for which we use a smooth backfitting technique to deal with the additive nonparametric part. We also present the finite sample performances of the proposed estimators and analyze Boston housing data as an illustration.Comment: Published in at http://dx.doi.org/10.3150/10-BEJ296 the Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm

    A goodness-of-fit test for parametric and semi-parametric models in multiresponse regression

    Full text link
    We propose an empirical likelihood test that is able to test the goodness of fit of a class of parametric and semi-parametric multiresponse regression models. The class includes as special cases fully parametric models; semi-parametric models, like the multiindex and the partially linear models; and models with shape constraints. Another feature of the test is that it allows both the response variable and the covariate be multivariate, which means that multiple regression curves can be tested simultaneously. The test also allows the presence of infinite-dimensional nuisance functions in the model to be tested. It is shown that the empirical likelihood test statistic is asymptotically normally distributed under certain mild conditions and permits a wild bootstrap calibration. Despite the large size of the class of models to be considered, the empirical likelihood test enjoys good power properties against departures from a hypothesized model within the class.Comment: Published in at http://dx.doi.org/10.3150/09-BEJ208 the Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm

    Penalized Likelihood and Bayesian Function Selection in Regression Models

    Full text link
    Challenging research in various fields has driven a wide range of methodological advances in variable selection for regression models with high-dimensional predictors. In comparison, selection of nonlinear functions in models with additive predictors has been considered only more recently. Several competing suggestions have been developed at about the same time and often do not refer to each other. This article provides a state-of-the-art review on function selection, focusing on penalized likelihood and Bayesian concepts, relating various approaches to each other in a unified framework. In an empirical comparison, also including boosting, we evaluate several methods through applications to simulated and real data, thereby providing some guidance on their performance in practice

    Propriety of Posteriors in Structured Additive Regression Models: Theory and Empirical Evidence

    Get PDF
    Structured additive regression comprises many semiparametric regression models such as generalized additive (mixed) models, geoadditive models, and hazard regression models within a unified framework. In a Bayesian formulation, nonparametric functions, spatial effects and further model components are specified in terms of multivariate Gaussian priors for high-dimensional vectors of regression coefficients. For several model terms, such as penalised splines or Markov random fields, these Gaussian prior distributions involve rank-deficient precision matrices, yielding partially improper priors. Moreover, hyperpriors for the variances (corresponding to inverse smoothing parameters) may also be specified as improper, e.g. corresponding to Jeffery's prior or a flat prior for the standard deviation. Hence, propriety of the joint posterior is a crucial issue for full Bayesian inference in particular if based on Markov chain Monte Carlo simulations. We establish theoretical results providing sufficient (and sometimes necessary) conditions for propriety and provide empirical evidence through several accompanying simulation studies
    • …
    corecore