In data sets with many more features than observations, independent screening
based on all univariate regression models leads to a computationally convenient
variable selection method. Recent efforts have shown that in the case of
generalized linear models, independent screening may suffice to capture all
relevant features with high probability, even in ultra-high dimension. It is
unclear whether this formal sure screening property is attainable when the
response is a right-censored survival time. We propose a computationally very
efficient independent screening method for survival data which can be viewed as
the natural survival equivalent of correlation screening. We state conditions
under which the method admits the sure screening property within a general
class of single-index hazard rate models with ultra-high dimensional features.
An iterative variant is also described which combines screening with penalized
regression in order to handle more complex feature covariance structures. The
methods are evaluated through simulation studies and through application to a
real gene expression dataset.

Comment: 32 pages, 3 figures

English

arXiv

Gorst-Rasmussen, Anders

Scheike, Thomas

Copenhagen University Research Information System

Independent screening for single-index hazard rate models with ultrahigh dimensional features

Summary
In data sets with many more features than observations, independent screening based on all univariate regression models leads to a computationally convenient variable selection method. Recent efforts have shown that, in the case of generalized linear models, independent screening may suffice to capture all relevant features with high probability, even in ultrahigh dimension. It is unclear whether this formal sure screening property is attainable when the response is a right-censored survival time. We propose a computationally very efficient independent screening method for survival data which can be viewed as the natural survival equivalent of correlation screening. We state conditions under which the method admits the sure screening property within a class of single-index hazard rate models with ultrahigh dimensional features and describe the generally detrimental effect of censoring on performance. An iterative variant of the method is also described which combines screening with penalized regression to handle more complex feature covariance structures. The methodology is evaluated through simulation studies and through application to a real gene expression data set.
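
As a rough illustration of what marginal screening for right-censored data can look like, the Python sketch below scores each feature separately and keeps the highest-ranked ones. The summary does not give the formula for the screening statistic, so everything here is an assumption made for illustration rather than the authors' implementation: the score (at each observed event, the failing subject's feature value minus the average of that feature among subjects still at risk), the hypothetical function names `marginal_survival_score` and `screen_features`, and the toy data. Features are assumed to have been standardized beforehand so that scores are comparable.

```python
def marginal_survival_score(times, events, feature):
    """Illustrative marginal score for one feature (an assumed statistic).

    For every observed event, take the failing subject's feature value
    minus the mean feature value among subjects still at risk, then
    average over all n subjects. Censored subjects contribute only
    through the risk sets.
    """
    n = len(times)
    total = 0.0
    for i in range(n):
        if not events[i]:
            continue  # censored: no event term for this subject
        at_risk = [feature[k] for k in range(n) if times[k] >= times[i]]
        total += feature[i] - sum(at_risk) / len(at_risk)
    return total / n


def screen_features(times, events, X, keep):
    """Rank features by absolute marginal score; return top `keep` indices and all scores."""
    p = len(X[0])
    scores = []
    for j in range(p):
        column = [row[j] for row in X]
        scores.append(abs(marginal_survival_score(times, events, column)))
    ranked = sorted(range(p), key=lambda j: scores[j], reverse=True)
    return ranked[:keep], scores


# Toy data (hypothetical): feature 0 is large for early failures,
# feature 1 is constant and therefore uninformative.
times = [1, 2, 3, 4]
events = [1, 1, 1, 1]  # 1 = observed event, 0 = right-censored
X = [[4, 1], [3, 1], [2, 1], [1, 1]]

selected, scores = screen_features(times, events, X, keep=1)
print(selected, scores)
```

Each feature is scored independently of the others, so the cost grows linearly in the number of features. The quadratic loop over subjects is kept for readability; sorting by time and using running sums would remove it.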

Crossref

Journal of the Royal Statistical Society Series B (Statistical Methodology)
