SCAD-penalized regression in high-dimensional partially linear models
We consider the problem of simultaneous variable selection and estimation in
partially linear models with a divergent number of covariates in the linear
part, under the assumption that the vector of regression coefficients is
sparse. We apply the SCAD penalty to achieve sparsity in the linear part and
use polynomial splines to estimate the nonparametric component. Under
reasonable conditions, it is shown that consistency in terms of variable
selection and estimation can be achieved simultaneously for the linear and
nonparametric components. Furthermore, the SCAD-penalized estimators of the
nonzero coefficients are shown to have the asymptotic oracle property, in the
sense that they are asymptotically normal with the same means and covariances that
they would have if the zero coefficients were known in advance. The finite
sample behavior of the SCAD-penalized estimators is evaluated with simulation
and illustrated with a data set.
Comment: Published at http://dx.doi.org/10.1214/07-AOS580 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
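The SCAD penalty behind the sparsity above has a simple closed form: linear like the lasso near zero, quadratic in a transition zone, then constant, so large coefficients escape shrinkage. A minimal sketch of the penalty function (the formula and the conventional choice a = 3.7 follow Fan and Li's definition; this is illustrative code, not the authors' implementation):

```python
import numpy as np

def scad_penalty(t, lam, a=3.7):
    """SCAD penalty of Fan and Li, evaluated elementwise.

    Linear (lasso-like) for |t| <= lam, quadratic for
    lam < |t| <= a*lam, and constant beyond a*lam, so large
    coefficients are not shrunk.  Requires a > 2.
    """
    t = np.abs(np.asarray(t, dtype=float))
    linear = lam * t
    quad = -(t**2 - 2 * a * lam * t + lam**2) / (2 * (a - 1))
    const = (a + 1) * lam**2 / 2
    return np.where(t <= lam, linear,
                    np.where(t <= a * lam, quad, const))
```

The three pieces meet continuously at |t| = lam and |t| = a*lam, which is what makes the penalized objective well behaved despite being nonconvex.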
Scalable Sparse Cox's Regression for Large-Scale Survival Data via Broken Adaptive Ridge
This paper develops a new scalable sparse Cox regression tool for sparse
high-dimensional massive sample size (sHDMSS) survival data. The method is a
local L0-penalized Cox regression obtained by repeatedly performing reweighted
L2-penalized Cox regression. We show that the resulting estimator enjoys the
best of L0- and L2-penalized Cox regressions while overcoming their
limitations. Specifically, the estimator is selection consistent, oracle for
parameter estimation, and possesses a grouping property for highly correlated
covariates. Simulation results suggest that when the sample size is large, the
proposed method with pre-specified tuning parameters has a comparable or better
performance than some popular penalized regression methods. More importantly,
because the method naturally enables adaptation of efficient algorithms for
massive L2-penalized optimization and does not require costly data-driven
tuning parameter selection, it has a significant computational advantage for
sHDMSS data, offering an average 5-fold speedup over its closest competitor
in empirical studies.
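The reweighted-ridge iteration at the heart of the method can be sketched in a few lines. The paper works with the Cox partial likelihood; purely for illustration, the same broken-adaptive-ridge update is shown here for least squares, and the tuning values (`lam`, `xi`, the zero threshold) are demo choices, not taken from the paper:

```python
import numpy as np

def bar_linear(X, y, lam=1.0, xi=1.0, n_iter=50, eps=1e-10):
    """Broken adaptive ridge (BAR) sketch for least squares.

    Starts from an ordinary ridge fit, then repeatedly solves a
    ridge problem whose penalty weights are the inverse squared
    coefficients from the previous step.  Small coefficients get
    ever-larger penalties and collapse to (numerical) zero, while
    large ones are barely shrunk -- an L2 surrogate for L0.
    """
    n, p = X.shape
    beta = np.linalg.solve(X.T @ X + xi * np.eye(p), X.T @ y)
    for _ in range(n_iter):
        w = 1.0 / np.maximum(beta**2, eps)       # adaptive ridge weights
        beta = np.linalg.solve(X.T @ X + lam * np.diag(w), X.T @ y)
    beta[np.abs(beta) < 1e-6] = 0.0              # report exact zeros
    return beta
```

Because every step is a plain ridge solve, any fast large-scale ridge machinery can be reused, which is the computational point the abstract makes.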
Bayesian variable selection with shrinking and diffusing priors
We consider a Bayesian approach to variable selection in the presence of high
dimensional covariates based on a hierarchical model that places prior
distributions on the regression coefficients as well as on the model space. We
adopt the well-known spike and slab Gaussian priors with a distinct feature,
that is, the prior variances depend on the sample size through which
appropriate shrinkage can be achieved. We show the strong selection consistency
of the proposed method in the sense that the posterior probability of the true
model converges to one even when the number of covariates grows nearly
exponentially with the sample size. This is arguably the strongest selection
consistency result that has been available in the Bayesian variable selection
literature; yet the proposed method can be carried out through posterior
sampling with a simple Gibbs sampler. Furthermore, we argue that the proposed
method is asymptotically similar to model selection with the L0 penalty. We
also demonstrate through empirical work the fine performance of the proposed
approach relative to some state-of-the-art alternatives.
Comment: Published at http://dx.doi.org/10.1214/14-AOS1207 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
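The spike-and-slab structure above lends itself to a simple two-block Gibbs sampler: draw the coefficients given the inclusion indicators, then flip each indicator by comparing spike and slab densities at the drawn coefficient. A minimal sketch for linear regression, with fixed prior variances `tau0`, `tau1` and a known noise level standing in for the paper's sample-size-dependent choices (all tuning values here are hypothetical):

```python
import numpy as np

def spike_slab_gibbs(X, y, tau0=0.1, tau1=3.0, q=0.5, sigma=1.0,
                     n_iter=500, burn=100, seed=1):
    """Minimal Gibbs sampler for a Gaussian spike-and-slab prior.

    beta_j ~ N(0, tau0^2) under the spike (Z_j = 0) and
    N(0, tau1^2) under the slab (Z_j = 1); q is the prior
    inclusion probability.  Returns posterior inclusion
    probabilities estimated from the post-burn-in draws.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    XtX, Xty = X.T @ X, X.T @ y
    Z = np.ones(p, dtype=int)
    keep = []
    for it in range(n_iter):
        # beta | Z: Gaussian, with prior precision set by the indicators
        prior_prec = np.where(Z == 1, 1.0 / tau1**2, 1.0 / tau0**2)
        V = np.linalg.inv(XtX / sigma**2 + np.diag(prior_prec))
        m = V @ Xty / sigma**2
        beta = rng.multivariate_normal(m, V)
        # Z_j | beta_j: compare slab and spike densities at beta_j
        p1 = q * np.exp(-beta**2 / (2 * tau1**2)) / tau1
        p0 = (1 - q) * np.exp(-beta**2 / (2 * tau0**2)) / tau0
        Z = (rng.random(p) < p1 / (p0 + p1)).astype(int)
        if it >= burn:
            keep.append(Z)
    return np.mean(keep, axis=0)  # posterior inclusion probabilities
```

Shrinking `tau0` and letting `tau1` grow with the sample size, as the paper prescribes, is what separates the true model from the rest as n increases; the two conditional draws above are all the sampler needs.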
Nonconcave penalized likelihood with a diverging number of parameters
A class of variable selection procedures for parametric models via nonconcave
penalized likelihood was proposed by Fan and Li to simultaneously estimate
parameters and select important variables. They demonstrated that this class of
procedures has an oracle property when the number of parameters is finite.
However, in many model selection problems the number of parameters is
large and grows with the sample size. In this paper some asymptotic properties
of the nonconcave penalized likelihood are established for situations in which
the number of parameters tends to infinity as the sample size increases.
Under regularity conditions we establish an oracle property and the
asymptotic normality of the penalized likelihood estimators. Furthermore, the
consistency of the sandwich formula for the covariance matrix is demonstrated.
Nonconcave penalized likelihood ratio statistics are discussed, and their
asymptotic distributions under the null hypothesis are obtained by imposing
some mild conditions on the penalty functions.
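Schematically, the sandwich covariance estimator referred to above combines the penalized Hessian with the variance of the score (notation follows Fan and Li's local quadratic approximation of the penalty; this is a schematic, with exact conditions as in the paper):

```latex
\[
\widehat{\operatorname{cov}}(\widehat{\beta}_1)
  = \bigl\{\nabla^2 \ell(\widehat{\beta}_1)
        - n\,\Sigma_\lambda(\widehat{\beta}_1)\bigr\}^{-1}
    \,\widehat{\operatorname{cov}}\bigl\{\nabla \ell(\widehat{\beta}_1)\bigr\}\,
    \bigl\{\nabla^2 \ell(\widehat{\beta}_1)
        - n\,\Sigma_\lambda(\widehat{\beta}_1)\bigr\}^{-1},
\qquad
\Sigma_\lambda(\beta)
  = \operatorname{diag}\Bigl\{\frac{p'_\lambda(|\beta_1|)}{|\beta_1|},
      \ldots\Bigr\},
\]
```

where \(\ell\) is the log-likelihood restricted to the selected (nonzero) coefficients and \(p_\lambda\) is the penalty; the paper's contribution is showing this estimator remains consistent as the number of parameters diverges.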
Discussion: One-step sparse estimates in nonconcave penalized likelihood models
Discussion of ``One-step sparse estimates in nonconcave penalized likelihood
models'' [arXiv:0808.1012]
Comment: Published at http://dx.doi.org/10.1214/07-AOS0316C in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org).