4 research outputs found

    Semiparametric Regression Analysis of Panel Count Data and Interval-Censored Failure Time Data

    This dissertation discusses three research topics in semiparametric regression analysis of panel count data and interval-censored data. Both types of data arise commonly in real-life studies in fields such as epidemiology, social science, and medical research. In these studies, subjects are usually examined multiple times at periodic or irregular follow-up examinations. For panel count data, the response variable is the count of some recurrent event whose exact occurrence times are usually unknown. For interval-censored data, the response variable is the time to some event of interest, often called the survival time or failure time; the exact time is never observed but is known to fall within an interval formed by two examination times. The primary goal for both types of data is to study the effects of covariates on the response variable, which can be done by regression analysis. Chapter 1 of this dissertation describes panel count data and interval-censored data in detail, with several real-life examples. It reviews existing approaches and commonly used semiparametric regression models for analyzing the two types of data, and presents preliminary material used in our approaches, such as monotone splines and the EM algorithm. In Chapter 2, we propose a gamma frailty non-homogeneous Poisson process model for the regression analysis of panel count data to account for within-subject correlation. This topic is important because ignoring such correlation results in biased estimation and may lead to misleading conclusions, and the literature on it is limited. We propose an efficient estimation approach based on an EM algorithm. Our approach is robust to initial values, converges fast, and provides variance estimates in closed form. 
Our approach performs very well in estimating both the regression parameters and the baseline mean function when within-subject correlation is present, and it can also be used when no such correlation exists. An R package, PCDSpline, has been developed and made available on CRAN to disseminate our approach. In Chapter 3, we study regression analysis of case 1 interval-censored data, also referred to as current status data, using the generalized odds-rate hazards (GORH) models. The GORH models are a general class of semiparametric models and have been widely used for analyzing right-censored data. However, their use for current status data is not found in the literature. We propose an efficient estimation approach with fixed p in the GORH models based on a novel EM algorithm. The proposed method is robust to initial values, converges fast, and provides variance estimates in closed form. A working model approach is proposed for the case where the true value of p is unknown; it does not require fitting the GORH models with different p values. The proposed approach and working model strategy are evaluated in an extensive simulation study, where they show good performance, and are illustrated with a large real-life data set. In Chapter 4, we study the joint modeling of panel count data and interval-censored failure time data, motivated by a real-life data set on sexually transmitted infections (STI). The failure time of interest is the time from enrollment to a new STI, which has an interval-censored data structure. The other response variable is the number of unprotected sex acts over time, which has a panel count data structure. The proposed joint analysis, based on an EM algorithm, is more efficient than separate univariate analyses of the panel count data and the interval-censored data. The proposed joint model and approach are applied to the STI data.
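The two data structures described in this abstract can be made concrete with a small sketch. It assumes that the recurrent event times and the failure time are known to the simulator but hidden from the analyst, who sees only what the examination times reveal. The function names `panel_counts` and `censoring_interval` are hypothetical and are not part of PCDSpline or of the dissertation.

```python
import bisect
import math

def panel_counts(event_times, exam_times):
    """Number of recurrent events in each interval (t_{j-1}, t_j], with t_0 = 0.

    Events after the last examination are never observed and are ignored.
    """
    exams = sorted(exam_times)
    events = sorted(event_times)
    # Cumulative number of events with time <= each examination time.
    cumulative = [bisect.bisect_right(events, t) for t in exams]
    return [c - p for c, p in zip(cumulative, [0] + cumulative[:-1])]

def censoring_interval(failure_time, exam_times):
    """Interval (L, R] known to contain the failure time.

    R = math.inf means the failure was not seen by the last examination
    (right-censored); L = 0.0 means it occurred before the first one.
    """
    exams = sorted(exam_times)
    # Index of the first examination at or after the failure time.
    idx = bisect.bisect_left(exams, failure_time)
    left = exams[idx - 1] if idx > 0 else 0.0
    right = exams[idx] if idx < len(exams) else math.inf
    return left, right

# Hypothetical subject examined at times 1, 3 and 6.
print(panel_counts([0.5, 2.0, 2.5, 5.0], [1, 3, 6]))  # [1, 2, 1]
print(censoring_interval(4.2, [1, 3, 6]))             # (3, 6)
```

With a single examination time, `censoring_interval` returns either (0, t] or (t, inf), which is the current status (case 1 interval-censored) structure studied in Chapter 3.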

    Semiparametric Regression Analysis of Survival Data and Panel Count Data

    Both censored survival data and panel count data arise commonly in real-life studies in fields such as epidemiology, social science, and medical research. In these studies, subjects are usually examined multiple times at periodic or irregular follow-up examinations. Censored data arise when the exact failure times of the events are of interest but not all of them are directly observed: some failure times are known only to fall within intervals formed by the observation times. Panel count data arise when the exact times of the recurrent events are not of interest, but the counts of those events within the observation intervals are available and of interest. This dissertation discusses three semiparametric regression models that can be used to analyze censored survival data and panel count data. Chapter 1 of this dissertation proposes an estimation approach for regression analysis of arbitrarily censored survival data under the proportional odds model. Arbitrarily censored data contain a mixture of exactly observed, left-censored, interval-censored, and right-censored observations. Existing research on regression analysis of arbitrarily censored data is sparse and limited to the proportional hazards model. In this chapter, a novel estimation approach based on an EM algorithm is proposed for analyzing arbitrarily censored data under the proportional odds model. The proposed EM algorithm is robust to initial values, easy to implement, and fast to converge, and it provides the variance estimate of the regression parameter estimate in closed form. This method has shown excellent performance in estimating the regression parameters as well as the baseline survival function in an extensive simulation study. Several real-life data applications are provided for illustration. 
In Chapter 2, a novel Bayesian approach is proposed to analyze panel count data. The widely used gamma frailty Poisson process model has been shown to have good estimation performance and some robustness against misspecification of the frailty distribution, but it may still produce biased estimates in some cases when the gamma frailty assumption is violated. In this chapter, we tackle the problem by modeling the frailty distribution nonparametrically, adopting a Dirichlet Process Gamma Mixture (DPGM) prior for it. An easy-to-implement Gibbs sampler is developed to facilitate the Bayesian computation. The proposed Bayesian approach performs very well in estimating the regression parameters and the baseline mean function in our simulations. It outperforms the gamma frailty Poisson model when the gamma frailty distribution is misspecified. The proposed method is applied to the well-known bladder cancer data for illustration and comparison with existing methods. In Chapter 3, a novel unified Bayesian approach is developed for analyzing panel count data under the gamma frailty Poisson process model and interval-censored data under Cox’s proportional hazards model and the proportional odds model. The baseline functions in these models share the property of being nondecreasing positive functions and are modeled nonparametrically by assigning a Gamma process prior. Efficient and easy-to-implement Gibbs samplers are developed for the posterior computation under these three models for the two types of data. The proposed methods are evaluated in extensive simulation studies and illustrated by real-life data applications.
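For orientation, the three models named in this abstract are commonly written as below. This is the standard textbook form, not notation taken from the dissertation; the symbols (frailty \phi_i, baseline mean function \mu_0, baseline cumulative hazard \Lambda_0, baseline distribution function F_0, covariates x, coefficients \beta) are assumptions made here for illustration.

```latex
% Gamma frailty Poisson process for panel counts N_i(t):
\[ E\{N_i(t) \mid x_i, \phi_i\} = \phi_i \, \mu_0(t) \exp(x_i^\top \beta) \]

% Proportional hazards model, in terms of the cumulative hazard:
\[ \Lambda(t \mid x) = \Lambda_0(t) \exp(x^\top \beta) \]

% Proportional odds model, in terms of the odds of failure by time t:
\[ \frac{F(t \mid x)}{1 - F(t \mid x)} = \frac{F_0(t)}{1 - F_0(t)} \exp(x^\top \beta) \]
```

In each case the unknown baseline function, \mu_0(t), \Lambda_0(t), or F_0(t)/\{1 - F_0(t)\}, is nondecreasing in t, which is the shared property that allows a single nondecreasing-process prior to be used for all three.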

    Bayesian Semiparametric Methods for Analyzing Panel Count Data

    Panel count data commonly arise in epidemiological, social science, and medical studies in which the recurrent events of interest are recorded for each subject at different observation times. Since the subjects are not under continuous monitoring, the exact times of those recurrent events are not observed, but the counts of such events between adjacent observation times are known. Panel count data can be regarded as a special type of longitudinal data with a count response variable. Compared with the frequentist literature, very few Bayesian approaches have been developed to analyze panel count data. In this dissertation, several Bayesian estimation approaches are proposed for analyzing panel count data under different semiparametric regression models. Chapter 1 of this dissertation describes panel count data, reviews the literature on existing methods, and gives background on the tools used in the proposed methods. Chapter 2 proposes a Bayesian estimation approach under the Poisson proportional mean model, in which we model the baseline mean function with the monotone splines of Ramsay (1988) [1]. An efficient Gibbs sampler is proposed in which all parameters can be either sampled directly from their full conditional distributions in standard forms or updated through automatic adaptive rejection sampling. Our proposed method is evaluated through extensive simulations and compared with two existing methods. Our method is applied to a bladder cancer data set for illustration. Chapter 3 proposes a new Bayesian estimation approach for analyzing panel count data when there is heterogeneity in the population that cannot be described by the available covariates. A frailty Poisson proportional mean model is proposed, with unobserved gamma frailties representing the heterogeneity among the subjects. 
Simulation studies suggest that our method not only performs well when such frailty exists but also provides robust estimation when there is no frailty. The bladder cancer tumor data are analyzed for illustration. Chapter 4 investigates the robustness of the Bayesian approaches proposed in Chapter 2 and Chapter 3 through simulations. We conclude that our proposed Bayesian methods still perform well in most cases when the model assumptions are violated.
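As a rough illustration of why gamma frailties are convenient in a Gibbs sampler, the sketch below draws one subject's frailty from its full conditional distribution. It assumes a Gamma(nu, nu) frailty (mean 1) and Poisson counts whose mean is the frailty times a known cumulative mean; under those assumptions the full conditional is again a gamma distribution by conjugacy. This is a generic conjugate update, not necessarily the sampler used in the dissertation, and the names `frailty_posterior` and `draw_frailty` are hypothetical.

```python
import random

def frailty_posterior(nu, total_count, cum_mean):
    """Shape and rate of the gamma full conditional of one subject's frailty.

    Prior:      phi ~ Gamma(shape=nu, rate=nu), so E[phi] = 1.
    Likelihood: the subject's total count is Poisson(phi * cum_mean), where
                cum_mean = mu0(last observation time) * exp(x' beta).
    Posterior:  phi | data ~ Gamma(shape=nu + total_count, rate=nu + cum_mean).
    """
    return nu + total_count, nu + cum_mean

def draw_frailty(nu, total_count, cum_mean, rng):
    """One Gibbs draw of the frailty from its full conditional."""
    shape, rate = frailty_posterior(nu, total_count, cum_mean)
    # random.gammavariate is parameterized by shape and scale, so pass 1 / rate.
    return rng.gammavariate(shape, 1.0 / rate)

# Hypothetical subject: 5 events observed where the model expects 3.
rng = random.Random(2024)
shape, rate = frailty_posterior(2.0, 5, 3.0)
print(shape / rate)                      # posterior mean of the frailty: 1.4
print(draw_frailty(2.0, 5, 3.0, rng))    # one random draw, always positive
```

The posterior mean (nu + total_count) / (nu + cum_mean) exceeds 1 when a subject has more events than the covariates alone predict, which is how the frailty absorbs unexplained heterogeneity.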