New Statistical Issues for Censored Survival Data: High-Dimensionality and Censored Covariate.
Censored survival data arise commonly in many areas including epidemiology, engineering and sociology. In this dissertation, we explore several emerging statistical issues for censored survival data.
In Chapter 2, we consider finite sample properties of the regularized high-dimensional Cox regression via lasso. Existing literature focuses on linear or generalized linear models with Lipschitz loss functions, where the empirical risk functions are the summations of independent and identically distributed (iid) losses. The summands in the negative log partial likelihood function for censored survival data, however, are neither iid nor Lipschitz. We first approximate the negative log partial likelihood function by a sum of iid non-Lipschitz terms, then derive the non-asymptotic oracle inequalities for the lasso penalized Cox regression, using pointwise arguments to tackle the difficulties caused by the lack of iid Lipschitz losses.
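The objective discussed in Chapter 2 can be sketched numerically as follows. This is a minimal illustration, not the dissertation's code: the function names, the division by the sample size, and the assumption of no tied event times are all choices made here. It shows why the summands are not iid: each term involves a sum over the risk set, which depends on the other subjects.

```python
import numpy as np

def neg_log_partial_likelihood(beta, X, time, event):
    """Negative log partial likelihood of the Cox model, divided by n.

    Assumes no tied event times (an assumption of this sketch).
    """
    eta = X @ beta
    # Sort by decreasing time so a running sum covers the risk set
    # {j : time_j >= time_i} of each subject i.
    order = np.argsort(-time)
    eta_s, event_s = eta[order], event[order]
    # log sum_{j in risk set} exp(eta_j), computed stably
    log_risk = np.logaddexp.accumulate(eta_s)
    return -np.sum(event_s * (eta_s - log_risk)) / len(time)

def lasso_cox_objective(beta, X, time, event, lam):
    """Lasso-penalized objective: loss plus lam times the L1 norm of beta."""
    return neg_log_partial_likelihood(beta, X, time, event) + lam * np.sum(np.abs(beta))
```

With `beta` set to zero and every subject experiencing the event, the loss reduces to the average of the log risk-set sizes, which gives a quick sanity check.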
In Chapter 3, we consider generalized linear regression analysis with a left-censored covariate due to the limit of detection. The complete case analysis yields valid estimates for regression coefficients, but loses efficiency. Substitution methods are biased; the maximum likelihood method relies on parametric models for the unobservable tail probability, and thus may suffer from model misspecification. To obtain robust and more efficient results, we propose a semiparametric likelihood-based approach for the regression parameters using an accelerated failure time model for the left-censored covariate. A two-stage estimation procedure is considered. The proposed method outperforms the existing methods in simulation studies. Technical conditions for asymptotic properties are provided.
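The contrast between complete case analysis and substitution can be illustrated with a small simulation. This is only a sketch under assumptions made here, not the dissertation's simulation design: a linear model with a standard normal covariate, a hypothetical detection limit `lod`, and substitution of the detection limit itself for values below it. It does not implement the proposed semiparametric method.

```python
import numpy as np

def compare_lod_strategies(n=20000, lod=-0.5, slope=2.0, seed=0):
    """Return (complete-case slope, substitution slope) from one simulated data set."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)                      # covariate, e.g. a log-concentration
    y = 1.0 + slope * x + rng.normal(size=n)    # outcome from a linear model
    observed = x >= lod                         # values below the limit are left-censored

    # Complete case analysis: drop subjects whose covariate is censored.
    cc_slope = np.polyfit(x[observed], y[observed], 1)[0]

    # Substitution: replace censored covariate values with the detection limit.
    x_sub = np.where(observed, x, lod)
    sub_slope = np.polyfit(x_sub, y, 1)[0]
    return cc_slope, sub_slope
```

In this setup the complete-case slope should land near the true value of 2, while the substitution slope is pushed above it, because censored subjects with low outcomes are all assigned the same covariate value.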
In Chapter 4, we consider longitudinal data analysis with a terminal event. The existing methods include the joint modeling approach and the marginal estimating equation approach, and both assume that the relationship between the response variable and a set of covariates is the same regardless of whether the terminal event occurs. This assumption, however, is not reasonable for many longitudinal studies. Therefore we directly model event time as a covariate, which provides intuitive interpretation. When the terminal event times are right-censored, a semiparametric likelihood-based approach similar to Chapter 3 is proposed for parameter estimation. The proposed method outperforms the complete case analysis in simulation studies and its asymptotic properties are provided.
PhD, Biostatistics, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/108930/1/kongsc_1.pd
A sieve M-theorem for bundled parameters in semiparametric models, with application to the efficient estimation in a linear model for censored data
In many semiparametric models that are parameterized by two types of
parameters---a Euclidean parameter of interest and an infinite-dimensional
nuisance parameter---the two parameters are bundled together, that is, the
nuisance parameter is an unknown function that contains the parameter of
interest as part of its argument. For example, in a linear regression model for
censored survival data, the unspecified error distribution function involves
the regression coefficients. Motivated by developing an efficient estimating
method for the regression parameters, we propose a general sieve M-theorem for
bundled parameters and apply the theorem to deriving the asymptotic theory for
the sieve maximum likelihood estimation in the linear regression model for
censored survival data. The numerical implementation of the proposed estimating
method can be achieved through the conventional gradient-based search
algorithms such as the Newton--Raphson algorithm. We show that the proposed
estimator is consistent and asymptotically normal and achieves the
semiparametric efficiency bound. Simulation studies demonstrate that the
proposed method performs well in practical settings and yields more efficient
estimates than existing estimating equation based methods. Illustration with a
real data example is also provided.
Comment: Published at http://dx.doi.org/10.1214/11-AOS934 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
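To make the bundling concrete, one common way to write the log-likelihood contribution of a single observation in this setting is sketched below. The notation is assumed here rather than taken from the paper: Y is the observed, possibly right-censored response, \Delta the event indicator, X the covariates, and \lambda_0 and \Lambda_0 the hazard and cumulative hazard functions of the error term.

```latex
% Linear model with unspecified error distribution, subject to right censoring:
%   T = X^\top \beta + \varepsilon, \quad Y = \min(T, C), \quad \Delta = 1\{T \le C\}
\ell(\beta, \lambda_0; Y, \Delta, X)
  = \Delta \log \lambda_0\!\left(Y - X^\top \beta\right)
  - \Lambda_0\!\left(Y - X^\top \beta\right),
\qquad
\Lambda_0(t) = \int_{-\infty}^{t} \lambda_0(s)\, ds .
```

The nuisance function \lambda_0 is evaluated at Y - X^\top \beta, so it cannot be separated from \beta; this is the sense in which the two parameters are bundled.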
Most Likely Transformations
We propose and study properties of maximum likelihood estimators in the class
of conditional transformation models. Based on a suitable explicit
parameterisation of the unconditional or conditional transformation function,
we establish a cascade of increasingly complex transformation models that can
be estimated, compared and analysed in the maximum likelihood framework. Models
for the unconditional or conditional distribution function of any univariate
response variable can be set-up and estimated in the same theoretical and
computational framework simply by choosing an appropriate transformation
function and parameterisation thereof. The ability to evaluate the distribution
function directly allows us to estimate models based on the exact likelihood,
especially in the presence of random censoring or truncation. For discrete and
continuous responses, we establish the asymptotic normality of the proposed
estimators. A reference software implementation of maximum likelihood-based
estimation for conditional transformation models allowing the same flexibility
as the theory developed here was employed to illustrate the wide range of
possible applications.
Comment: Accepted for publication by the Scandinavian Journal of Statistics
2017-06-1
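As a rough illustration of why direct access to the distribution function helps under censoring, the model class can be written as below. The symbols are assumptions of this sketch: F_Z is a fixed, known continuous distribution function and h(\cdot \mid x) is a monotone increasing transformation function.

```latex
% Conditional distribution function expressed through a transformation function
P(Y \le y \mid X = x) = F_Z\bigl(h(y \mid x)\bigr)

% Likelihood contribution of an observation known only to lie in (\underline{y}, \bar{y}]
L\bigl(h \mid Y \in (\underline{y}, \bar{y}],\, x\bigr)
  = F_Z\bigl(h(\bar{y} \mid x)\bigr) - F_Z\bigl(h(\underline{y} \mid x)\bigr)
```

Right censoring corresponds to \bar{y} = \infty, in which case the contribution becomes 1 - F_Z(h(\underline{y} \mid x)).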
A simple GMM estimator for the semi-parametric mixed proportional hazard model
Ridder and Woutersen (2003) have shown that under a weak condition on the baseline hazard, there exist root-N consistent estimators of the parameters in a semiparametric Mixed Proportional Hazard model with a parametric baseline hazard and unspecified distribution of the unobserved heterogeneity. We extend the Linear Rank Estimator (LRE) of Tsiatis (1990) and Robins and Tsiatis (1991) to this class of models. The optimal LRE is a two-step estimator. We propose a simple one-step estimator that is close to optimal if there is no unobserved heterogeneity. The efficiency gain associated with the optimal LRE increases with the degree of unobserved heterogeneity.
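For orientation, the Mixed Proportional Hazard model is usually written in the following form. The notation is assumed here and is not taken from the paper: v is the unobserved heterogeneity term with unspecified distribution, and \lambda(t; \alpha) is the parametric baseline hazard.

```latex
% Hazard at time t given covariates x and unobserved heterogeneity v > 0
\theta(t \mid x, v) = v \, \lambda(t; \alpha) \, \exp\!\left(x^\top \beta\right)

% One possible parametric baseline (a Weibull hazard, used only as an example)
\lambda(t; \alpha) = \alpha \, t^{\alpha - 1}, \qquad \alpha > 0
```

When v is constant across individuals, the model reduces to an ordinary proportional hazard model, which is the case in which the abstract describes the one-step estimator as close to optimal.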
Maximum likelihood estimation in a partially observed stratified regression model with censored data
The stratified proportional intensity model generalizes Cox's proportional
intensity model by allowing different groups of the population under study to
have distinct baseline intensity functions. In this article, we consider the
problem of estimation in this model when the variable indicating the stratum is
unobserved for some individuals in the studied sample. In this setting, we
construct nonparametric maximum likelihood estimators for the parameters of the
stratified model and we establish their consistency and asymptotic normality.
Consistent estimators for the limiting variances are also obtained.
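The stratified model referred to above is commonly written as follows. The notation is assumed for this sketch: S denotes the stratum indicator, Z the covariates, and K the number of strata.

```latex
% Stratum-specific baseline intensity with a common regression parameter
\lambda(t \mid Z, S = s) = \lambda_{0s}(t) \, \exp\!\left(\beta^\top Z\right),
\qquad s = 1, \dots, K
```

Cox's model is the special case K = 1. The estimation problem in the abstract arises because S is not observed for some individuals, so their stratum-specific baseline \lambda_{0s} is not known directly.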
Some Notes on Sample Selection Models
Sample selection problems are pervasive when working with microeconomic models and datasets of individuals, households or firms. During the last three decades, there have been very significant developments in this area of econometrics. Different types of models have been proposed and used in empirical applications, and new estimation and inference methods, both parametric and semiparametric, have been developed. These notes provide a brief introduction to this large literature.
Sample selection. Censored regression model. Truncated regression model. Treatment effects. Semiparametric methods.