High throughput genetic sequencing arrays with thousands of measurements per
sample and a great amount of related censored clinical data have increased
demanding need for better measurement specific model selection. In this paper
we establish strong oracle properties of nonconcave penalized methods for
nonpolynomial (NP) dimensional data with censoring in the framework of Cox's
proportional hazards model. A class of folded-concave penalties are employed
and both LASSO and SCAD are discussed specifically. We unveil the question
under which dimensionality and correlation restrictions can an oracle estimator
be constructed and grasped. It is demonstrated that nonconcave penalties lead
to significant reduction of the "irrepresentable condition" needed for LASSO
model selection consistency. The large deviation result for martingales,
bearing interests of its own, is developed for characterizing the strong oracle
property. Moreover, the nonconcave regularized estimator, is shown to achieve
asymptotically the information bound of the oracle estimator. A coordinate-wise
algorithm is developed for finding the grid of solution paths for penalized
hazard regression problems, and its performance is evaluated on simulated and
gene association study examples.Comment: Published in at http://dx.doi.org/10.1214/11-AOS911 the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org