3,257 research outputs found

    New Statistical Issues for Censored Survival Data: High-Dimensionality and Censored Covariate.

    Censored survival data arise commonly in many areas including epidemiology, engineering and sociology. In this dissertation, we explore several emerging statistical issues for censored survival data. In Chapter 2, we consider finite sample properties of the regularized high-dimensional Cox regression via lasso. Existing literature focuses on linear or generalized linear models with Lipschitz loss functions, where the empirical risk functions are the summations of independent and identically distributed (iid) losses. The summands in the negative log partial likelihood function for censored survival data, however, are neither iid nor Lipschitz. We first approximate the negative log partial likelihood function by a sum of iid non-Lipschitz terms, then derive the non-asymptotic oracle inequalities for the lasso penalized Cox regression, using pointwise arguments to tackle the difficulties caused by lacking iid Lipschitz losses. In Chapter 3, we consider generalized linear regression analysis with a left-censored covariate due to the limit of detection. The complete case analysis yields valid estimates for regression coefficients, but loses efficiency. Substitution methods are biased; the maximum likelihood method relies on parametric models for the unobservable tail probability, thus may suffer from model misspecification. To obtain robust and more efficient results, we propose a semiparametric likelihood-based approach for the regression parameters using an accelerated failure time model for the left-censored covariate. A two-stage estimation procedure is considered. The proposed method outperforms the existing methods in simulation studies. Technical conditions for asymptotic properties are provided. In Chapter 4, we consider longitudinal data analysis with a terminal event. 
The existing methods include the joint modeling approach and the marginal estimating equation approach; both assume that the relationship between the response variable and a set of covariates is the same whether or not the terminal event occurs. This assumption, however, is not reasonable for many longitudinal studies. We therefore model the event time directly as a covariate, which provides an intuitive interpretation. When the terminal event times are right-censored, a semiparametric likelihood-based approach similar to that of Chapter 3 is proposed for parameter estimation. The proposed method outperforms the complete case analysis in simulation studies, and its asymptotic properties are provided. PhD, Biostatistics, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/108930/1/kongsc_1.pd
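    As a reading aid for the Chapter 2 summary above, the lasso-penalized Cox objective can be sketched as follows. This is a minimal illustration, not the dissertation's code: the function name, the 1/n scaling, the tuning parameter `lam`, and the assumption of no tied event times are all choices made here. Each summand involves a risk-set sum over all subjects, which is why the terms are not independent.

```python
import numpy as np


def lasso_cox_objective(beta, X, time, event, lam):
    """Scaled negative log partial likelihood plus an L1 penalty.

    Illustrative sketch: assumes no tied event times; `lam` is a
    hypothetical tuning parameter, and the 1/n scaling is a convention
    chosen here.
    """
    beta = np.asarray(beta, dtype=float)
    X = np.asarray(X, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)

    eta = X @ beta  # linear predictor for each subject
    # at_risk[i, j] is True when subject j is still at risk at time[i]
    at_risk = time[None, :] >= time[:, None]
    # log of the risk-set sum of exp(eta) evaluated at each subject's time
    log_risk = np.log((at_risk * np.exp(eta)[None, :]).sum(axis=1))
    # only subjects with an observed event contribute a term
    neg_log_pl = -np.sum((eta - log_risk)[event]) / len(time)
    return neg_log_pl + lam * np.abs(beta).sum()
```

    With all coefficients at zero the penalty vanishes and each event contributes the log of its risk-set size, which gives a quick sanity check on the risk-set construction.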

    A sieve M-theorem for bundled parameters in semiparametric models, with application to the efficient estimation in a linear model for censored data

    In many semiparametric models that are parameterized by two types of parameters (a Euclidean parameter of interest and an infinite-dimensional nuisance parameter), the two parameters are bundled together; that is, the nuisance parameter is an unknown function that contains the parameter of interest as part of its argument. For example, in a linear regression model for censored survival data, the unspecified error distribution function involves the regression coefficients. Motivated by the need for an efficient estimating method for the regression parameters, we propose a general sieve M-theorem for bundled parameters and apply the theorem to derive the asymptotic theory for sieve maximum likelihood estimation in the linear regression model for censored survival data. The numerical implementation of the proposed estimating method can be achieved through conventional gradient-based search algorithms such as the Newton-Raphson algorithm. We show that the proposed estimator is consistent and asymptotically normal and achieves the semiparametric efficiency bound. Simulation studies demonstrate that the proposed method performs well in practical settings and yields more efficient estimates than existing estimating-equation-based methods. An illustration with a real data example is also provided. Comment: Published at http://dx.doi.org/10.1214/11-AOS934 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Most Likely Transformations

    We propose and study properties of maximum likelihood estimators in the class of conditional transformation models. Based on a suitable explicit parameterisation of the unconditional or conditional transformation function, we establish a cascade of increasingly complex transformation models that can be estimated, compared and analysed in the maximum likelihood framework. Models for the unconditional or conditional distribution function of any univariate response variable can be set up and estimated in the same theoretical and computational framework simply by choosing an appropriate transformation function and parameterisation thereof. The ability to evaluate the distribution function directly allows us to estimate models based on the exact likelihood, especially in the presence of random censoring or truncation. For discrete and continuous responses, we establish the asymptotic normality of the proposed estimators. A reference software implementation of maximum likelihood-based estimation for conditional transformation models, allowing the same flexibility as the theory developed here, was employed to illustrate the wide range of possible applications. Comment: Accepted for publication by the Scandinavian Journal of Statistics 2017-06-1
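    As a reading aid, a conditional transformation model of the kind described above can be written in the following form. The symbols $F_Z$, $h$, and the interval endpoints are notation assumed here for illustration and may differ from the paper's own.

```latex
% Assumed notation: F_Z is a chosen continuous reference distribution function,
% h(. | x) is a monotone increasing transformation function to be estimated.
\[
  \mathbb{P}(Y \le y \mid X = x) = F_Z\bigl(h(y \mid x)\bigr)
\]
% Exact likelihood contribution of a response known only to lie in
% the interval (\underline{y}_i, \bar{y}_i], as happens under censoring:
\[
  L_i = F_Z\bigl(h(\bar{y}_i \mid x_i)\bigr) - F_Z\bigl(h(\underline{y}_i \mid x_i)\bigr)
\]
```

    Because the distribution function is evaluated directly, such interval contributions need no approximation, which is the sense in which the abstract speaks of the exact likelihood.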

    A simple GMM estimator for the semi-parametric mixed proportional hazard model

    Ridder and Woutersen (2003) have shown that under a weak condition on the baseline hazard, there exist root-N consistent estimators of the parameters in a semiparametric Mixed Proportional Hazard model with a parametric baseline hazard and unspecified distribution of the unobserved heterogeneity. We extend the Linear Rank Estimator (LRE) of Tsiatis (1990) and Robins and Tsiatis (1991) to this class of models. The optimal LRE is a two-step estimator. We propose a simple one-step estimator that is close to optimal if there is no unobserved heterogeneity. The efficiency gain associated with the optimal LRE increases with the degree of unobserved heterogeneity.

    Maximum likelihood estimation in a partially observed stratified regression model with censored data

    The stratified proportional intensity model generalizes Cox's proportional intensity model by allowing different groups of the population under study to have distinct baseline intensity functions. In this article, we consider the problem of estimation in this model when the variable indicating the stratum is unobserved for some individuals in the studied sample. In this setting, we construct nonparametric maximum likelihood estimators for the parameters of the stratified model and we establish their consistency and asymptotic normality. Consistent estimators for the limiting variances are also obtained.
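    For orientation, the stratified proportional intensity model is commonly written as below. The notation ($S$ for the stratum indicator, $Z$ for the covariates, $K$ strata) is assumed here for illustration and is not taken from the article.

```latex
% Assumed notation: S is the stratum indicator, Z the covariate vector,
% \lambda_{0k} the unspecified baseline intensity of stratum k.
\[
  \lambda(t \mid Z, S = k) = \lambda_{0k}(t)\, \exp(\beta^{\top} Z), \qquad k = 1, \dots, K
\]
```

    Cox's model is the special case $K = 1$; the estimation problem in the abstract arises because $S$ is missing for some individuals.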

    Some Notes on Sample Selection Models

    Sample selection problems are pervasive when working with microeconomic models and datasets of individuals, households or firms. During the last three decades, there have been very significant developments in this area of econometrics. Different types of models have been proposed and used in empirical applications, and new estimation and inference methods, both parametric and semiparametric, have been developed. These notes provide a brief introduction to this large literature. Keywords: Sample selection. Censored regression model. Truncated regression model. Treatment effects. Semiparametric methods.