    Novel Computational Methods for Censored Data and Regression

    This dissertation can be divided into three topics. In the first topic, we derived a recursive algorithm for the constrained Kaplan-Meier estimator, which speeds up computation by as much as fifty times compared with the current method based on the EM algorithm. We also showed how this leads to a substantial improvement in empirical likelihood analysis with right-censored data. After a brief review of regularized regression, we investigated the computational problems in parametric/non-parametric hybrid accelerated failure time models and their regularization in a high-dimensional setting. We also illustrated that, as the number of pieces increases, the discussed models approach a fully nonparametric one. In the last topic, we discussed a semi-parametric approach to hypothesis testing in the binary choice model. The major tools used are a Buckley-James-type algorithm and empirical likelihood. The essential idea, similar to that of the first topic, is to iteratively compute the linearly constrained empirical likelihood using optimization algorithms, including the EM and iterative convex minorant algorithms.
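
    As a concrete reference point for the first topic, the sketch below implements the standard (unconstrained) Kaplan-Meier product-limit estimator in plain NumPy. It is not the dissertation's recursive algorithm for the constrained estimator; the function name and toy data are illustrative assumptions only.

        import numpy as np

        def kaplan_meier(time, event):
            """Product-limit estimator S(t) from right-censored data.

            time  : observed times (event or censoring)
            event : 1 if an event was observed, 0 if censored
            Returns the distinct event times and the survival curve.
            """
            time = np.asarray(time, dtype=float)
            event = np.asarray(event, dtype=int)
            order = np.argsort(time)
            time, event = time[order], event[order]

            n, surv, i = len(time), 1.0, 0
            out_t, out_s = [], []
            while i < n:
                t, at_risk, deaths = time[i], n - i, 0
                while i < n and time[i] == t:   # pool tied observation times
                    deaths += event[i]
                    i += 1
                if deaths > 0:                  # curve drops only at event times
                    surv *= 1.0 - deaths / at_risk
                    out_t.append(t)
                    out_s.append(surv)
            return np.array(out_t), np.array(out_s)

        # toy data: event indicator 0 marks a right-censored observation
        print(kaplan_meier([3, 5, 5, 8, 10, 12, 15], [1, 1, 0, 1, 0, 1, 0]))

    The constrained estimator discussed in the dissertation additionally imposes a linear constraint on the estimated distribution, which is the quantity both the recursive algorithm and the EM-based alternative target.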

    Event History Analysis in Longitudinal Cohort Studies with Intermittent Inspection Times

    Event history studies based on disease clinic data often face several complications. Specifically, patients visit the clinic irregularly, and the intermittent inspection times depend on the history of disease-related variables; this can cause event or failure times to be dependently interval-censored. Furthermore, failure times can be truncated, treatment assignment is non-randomized and may be confounded, and there are competing risks among the failure time outcomes under study. I propose a class of inverse probability weights applied to estimating functions so that the informative inspection scheme and confounded treatment assignment are appropriately handled. As a result, the distribution of the failure time outcomes can be consistently estimated. I consider parametric, non-parametric, and semi-parametric estimation; for the non- and semi-parametric cases, monotone smoothing techniques are employed in a two-stage estimation procedure. Simulations under a variety of failure time models examine the finite-sample performance of the proposed estimators. This research was initially motivated by the Psoriatic Arthritis (PsA) Toronto Cohort Study at Toronto Western Hospital, and the proposed methodologies are applied to that cohort study as an illustration.
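
    The weighting idea can be illustrated with a toy simulation: when the chance of being inspected depends on a variable that also drives the outcome, a naive analysis of the inspected subjects is biased, and weighting each inspected subject by the inverse of its estimated inspection probability corrects this. The sketch below is a schematic of inverse probability weighting only, not the proposed estimators; the logistic visit model and all names are assumptions.

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        rng = np.random.default_rng(0)
        n = 5000

        # covariate that drives both the outcome and the chance of inspection
        x = rng.normal(size=n)
        y = (rng.normal(size=n) + x > 0).astype(int)      # outcome of interest
        p_visit = 1 / (1 + np.exp(-(0.5 + 1.5 * x)))      # informative inspection
        seen = rng.random(n) < p_visit                    # True = outcome observed

        # naive estimate uses only inspected subjects and is biased upward here
        naive = y[seen].mean()

        # IPW: model the inspection process, then weight each observed subject
        # by the inverse of its estimated inspection probability
        ps = LogisticRegression().fit(x.reshape(-1, 1), seen.astype(int))
        w = 1 / ps.predict_proba(x[seen].reshape(-1, 1))[:, 1]
        ipw = np.sum(w * y[seen]) / np.sum(w)

        print(f"truth~{y.mean():.3f}  naive={naive:.3f}  ipw={ipw:.3f}")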

    A Provable Smoothing Approach for High Dimensional Generalized Regression with Applications in Genomics

    In many applications, linear models fit the data poorly. This article studies an appealing alternative, the generalized regression model. This model only assumes that there exists an unknown monotonically increasing link function connecting the response $Y$ to a single index $X^T\beta^*$ of explanatory variables $X \in \mathbb{R}^d$. The generalized regression model is flexible and covers many widely used statistical models. It fits the data-generating mechanism well in many real problems, which makes it useful in a variety of applications where regression models are regularly employed. In low dimensions, rank-based M-estimators are recommended for the generalized regression model, giving root-$n$ consistent estimators of $\beta^*$. Applications of these estimators to high-dimensional data, however, are questionable. This article studies, both theoretically and practically, a simple yet powerful smoothing approach to handling the high-dimensional generalized regression model. Theoretically, a family of smoothing functions is provided, and the amount of smoothing necessary for efficient inference is carefully calculated. Practically, our study is motivated by an important and challenging scientific problem: decoding gene regulation by predicting transcription factors that bind to cis-regulatory elements. Applying the proposed method to this problem shows substantial improvement over the state-of-the-art alternative on real data.
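
    One way to see what smoothing buys is the smoothed maximum rank correlation estimator: Han's rank objective counts concordant pairs through indicator functions, and replacing each indicator with a sigmoid of bandwidth $h$ makes the objective differentiable so standard optimizers apply. The sketch below is a low-dimensional illustration of that generic idea on assumed toy data; it is not the article's high-dimensional procedure, and the bandwidth choice is arbitrary.

        import numpy as np
        from scipy.optimize import minimize
        from scipy.special import expit

        def smoothed_mrc_loss(beta, X, y, h=0.1):
            """Smoothed maximum-rank-correlation objective for a single index.

            Replaces the indicator 1{x_i'b > x_j'b} in the rank objective by a
            sigmoid with bandwidth h, making the objective differentiable.
            """
            idx = X @ beta
            d_idx = idx[:, None] - idx[None, :]           # all pairwise index gaps
            gt = (y[:, None] > y[None, :]).astype(float)  # concordance targets
            return -np.mean(gt * expit(d_idx / h))

        # toy single-index data: unknown increasing link, here g(u) = u**3 + noise
        rng = np.random.default_rng(1)
        n, d = 400, 3
        X = rng.normal(size=(n, d))
        beta_star = np.array([1.0, -0.5, 0.25])
        y = (X @ beta_star) ** 3 + 0.1 * rng.normal(size=n)

        res = minimize(smoothed_mrc_loss, x0=np.ones(d), args=(X, y))
        b = res.x / np.abs(res.x[0])     # index coefficients identified up to scale
        print(np.round(b, 2))            # roughly (1, -0.5, 0.25)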

    SEMIPARAMETRIC METHODS TO IMPROVE RISK ASSESSMENT AND DYNAMIC PREDICTION

    Incorporating promising biomarkers to improve risk assessment and prediction is the central goal in many biomedical studies. Cost-effective designs and longitudinal designs are often used to measure biomarker information, but they pose challenges for data analysis. Statistical analyses of these kinds of data are routinely performed using parametric models; when the model assumptions are violated, parametric models may lead to substantial bias in parameter estimation, risk evaluation, and prediction. In this dissertation, we will develop robust, flexible statistical methods for risk assessment in matched case-control, nested case-control, and case-cohort designs, as well as a dynamic prediction tool for longitudinal data. In the first aim, we will develop a distribution-free method for identifying an optimal combination of biomarkers to differentiate cases from controls in matched case-control data. In the second aim, we will develop a semiparametric regression model with minimal assumptions on the link function for data from two-phase sampling designs with binary outcomes. In the third aim, we will develop a model-free dynamic prediction method for a survival outcome that provides dynamically updated risk scores from longitudinal biomarker(s).
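
    For the matched case-control setting of the first aim, a standard parametric baseline is conditional logistic regression, which for 1:1 matching reduces to ordinary logistic regression on within-pair biomarker differences with no intercept. The sketch below shows that reduction on simulated data; it is a conventional baseline of the kind the dissertation's distribution-free method aims to improve on, and the data-generating choices are assumptions.

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        rng = np.random.default_rng(2)
        pairs, p = 300, 4
        beta = np.array([1.0, 0.5, -0.8, 0.0])
        x_case = rng.normal(size=(pairs, p)) + 0.3 * beta  # cases shifted along beta
        x_ctrl = rng.normal(size=(pairs, p))               # matched controls

        # 1:1 conditional likelihood: prod_i sigma((x_case_i - x_ctrl_i)'beta).
        # Fitting logistic regression on the differences and their negations
        # (labels 1 and 0) with no intercept maximizes the same likelihood.
        diff = x_case - x_ctrl
        X = np.vstack([diff, -diff])
        y = np.r_[np.ones(pairs), np.zeros(pairs)]
        fit = LogisticRegression(fit_intercept=False, C=1e6).fit(X, y)
        print(np.round(fit.coef_.ravel(), 2))   # roughly proportional to beta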

    Penalized log-likelihood estimation for partly linear transformation models with current status data

    We consider partly linear transformation models applied to current status data. The unknown quantities are the transformation function, a linear regression parameter, and a nonparametric regression effect. It is shown that the penalized MLE for the regression parameter is asymptotically normal and efficient and converges at the parametric rate, although the penalized MLEs for the transformation function and the nonparametric regression effect are only $n^{1/3}$-consistent. Inference for the regression parameter based on a block jackknife is investigated. We also study computational issues and demonstrate the proposed methodology with a simulation study. The transformation models and partly linear regression terms, coupled with new estimation and inference techniques, provide flexible alternatives to the Cox model for current status data analysis. Published at http://dx.doi.org/10.1214/009053605000000444 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
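
    For intuition about the current status likelihood underlying the penalized MLE, the toy sketch below fits a fully parametric Weibull model by maximizing the sum of delta_i * log F(c_i) + (1 - delta_i) * log S(c_i), where c_i is the single inspection time and delta_i indicates whether the event has already occurred. The transformation function, partly linear structure, and penalty of the article are deliberately omitted; this is a minimal parametric analogue with assumed names and simulated data.

        import numpy as np
        from scipy.optimize import minimize

        # Current status data: each subject is inspected once at time c; we only
        # learn whether the event time T has already occurred (delta = 1{T <= c}).
        rng = np.random.default_rng(3)
        n = 1000
        T = rng.weibull(1.5, size=n) * 2.0        # latent event times (never seen)
        c = rng.uniform(0.1, 4.0, size=n)         # inspection times
        delta = (T <= c).astype(float)

        def negloglik(theta, c, delta):
            """Weibull(shape k, scale s) current status log-likelihood:
            sum delta*log F(c) + (1 - delta)*log S(c), with F the Weibull cdf."""
            k, s = np.exp(theta)                  # keep parameters positive
            logS = -(c / s) ** k                  # log survival function
            F = -np.expm1(logS)                   # 1 - exp(logS), numerically safe
            return -np.sum(delta * np.log(F + 1e-12) + (1 - delta) * logS)

        fit = minimize(negloglik, x0=np.zeros(2), args=(c, delta))
        print(np.round(np.exp(fit.x), 2))         # should be near (1.5, 2.0)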