
New Statistical Issues for Censored Survival Data: High-Dimensionality and Censored Covariate.

By Shengchun Kong


Censored survival data arise commonly in many areas, including epidemiology, engineering, and sociology. In this dissertation, we explore several emerging statistical issues for censored survival data. In Chapter 2, we consider finite sample properties of the regularized high-dimensional Cox regression via lasso. Existing literature focuses on linear or generalized linear models with Lipschitz loss functions, where the empirical risk functions are the summations of independent and identically distributed (iid) losses. The summands in the negative log partial likelihood function for censored survival data, however, are neither iid nor Lipschitz. We first approximate the negative log partial likelihood function by a sum of iid non-Lipschitz terms, then derive the non-asymptotic oracle inequalities for the lasso penalized Cox regression, using pointwise arguments to tackle the difficulties caused by the lack of iid Lipschitz losses. In Chapter 3, we consider generalized linear regression analysis with a left-censored covariate due to the limit of detection. The complete case analysis yields valid estimates for regression coefficients, but loses efficiency. Substitution methods are biased; the maximum likelihood method relies on parametric models for the unobservable tail probability, and thus may suffer from model misspecification. To obtain robust and more efficient results, we propose a semiparametric likelihood-based approach for the regression parameters using an accelerated failure time model for the left-censored covariate. A two-stage estimation procedure is considered. The proposed method outperforms the existing methods in simulation studies. Technical conditions for asymptotic properties are provided. In Chapter 4, we consider longitudinal data analysis with a terminal event.
The existing methods include the joint modeling approach and the marginal estimating equation approach, and both assume that the relationship between the response variable and a set of covariates is the same regardless of whether the terminal event occurs. This assumption, however, is not reasonable for many longitudinal studies. Therefore, we directly model event time as a covariate, which provides intuitive interpretation. When the terminal event times are right-censored, a semiparametric likelihood-based approach similar to that of Chapter 3 is proposed for parameter estimation. The proposed method outperforms the complete case analysis in simulation studies, and its asymptotic properties are provided.
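As an illustration of the Chapter 2 objective, the following is a minimal sketch of a lasso penalized negative log partial likelihood. It is not taken from the dissertation: the function names, the 1/n scaling, and the assumption of no tied event times are all choices made here for illustration. The sum over the risk set inside each term is what makes the summands depend on the whole sample, so they are not iid.

```python
import math

def neg_log_partial_likelihood(times, events, covariates, beta):
    """Scaled negative log partial likelihood for a Cox model.

    Assumes no tied event times (hypothetical simplification).
    times: observed times; events: 1 if the event was observed, 0 if censored;
    covariates: list of covariate rows; beta: coefficient list.
    """
    n = len(times)
    # Linear predictor x_i^T beta for every subject.
    scores = [sum(b * x for b, x in zip(beta, row)) for row in covariates]
    total = 0.0
    for i in range(n):
        if not events[i]:
            continue  # censored subjects contribute only through risk sets
        # Risk set: subjects still under observation at time t_i.
        risk = [scores[j] for j in range(n) if times[j] >= times[i]]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        m = max(risk)
        log_sum = m + math.log(sum(math.exp(s - m) for s in risk))
        total += scores[i] - log_sum
    return -total / n

def lasso_cox_objective(times, events, covariates, beta, lam):
    """Negative log partial likelihood plus an L1 penalty lam * ||beta||_1."""
    penalty = lam * sum(abs(b) for b in beta)
    return neg_log_partial_likelihood(times, events, covariates, beta) + penalty
```

For example, `lasso_cox_objective([1, 2, 3], [1, 0, 1], [[0.5], [1.0], [-1.0]], [0.0], 0.1)` evaluates the objective at a zero coefficient vector, where each observed event contributes the log of its risk-set size.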

Topics: Cox Regression, Finite Sample, Lasso, Oracle Inequality, Variable Selection, Accelerated Failure Time Model, Censored Covariate, Empirical Process, Generalized Linear Models, Pseudo-likelihood Estimation, Mixed Effects Model, Pseudo-maximum Likelihood Estimation
Year: 2014
OAI identifier: oai:deepblue.lib.umich.edu:2027.42/108930
