5 research outputs found

    Modeling event count data in the presence of informative dropout with application to bleeding and transfusion events in myelodysplastic syndrome

    In many biomedical studies, it is of interest to model event count data over the study period. Some patients cannot be followed for the entire study period owing to informative dropout, and the dropout time can provide valuable insight into the rate of the events. We propose a joint semiparametric model for event count data and informative dropout time that allows for correlation through a Gamma frailty. We develop efficient likelihood-based estimation and inference procedures. The proposed nonparametric maximum likelihood estimators are shown to be consistent and asymptotically normal. Furthermore, the asymptotic covariances of the finite-dimensional parameter estimates attain the semiparametric efficiency bound. Extensive simulation studies demonstrate that the proposed methods perform well in practice. We illustrate the proposed methods through an application to a clinical trial for bleeding and transfusion events in myelodysplastic syndrome.
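
To make the idea of informative dropout concrete, the following is a minimal simulation sketch, not the semiparametric model of the paper. It assumes a fully parametric toy setting: a Gamma frailty with mean 1 multiplies both a constant event rate and a constant dropout hazard. All function names and parameter values (`theta`, `base_event_rate`, `base_dropout_hazard`, `tau`) are hypothetical choices made for illustration.

```python
import random


def simulate_patient(rng, theta=1.0, base_event_rate=2.0,
                     base_dropout_hazard=0.5, tau=1.0):
    """Simulate one patient; return (event_count, follow_up, dropped_out).

    Assumption: a shared Gamma frailty scales both the event rate and the
    dropout hazard, which is what makes dropout informative in this toy model.
    """
    # Gamma frailty with mean 1 and variance theta (shape 1/theta, scale theta).
    frailty = rng.gammavariate(1.0 / theta, theta)
    # Exponential dropout time whose hazard is proportional to the frailty.
    dropout_time = rng.expovariate(frailty * base_dropout_hazard)
    follow_up = min(dropout_time, tau)
    # Homogeneous Poisson process on [0, follow_up], simulated through
    # exponential waiting times between events.
    rate = frailty * base_event_rate
    count = 0
    t = rng.expovariate(rate)
    while t <= follow_up:
        count += 1
        t += rng.expovariate(rate)
    return count, follow_up, dropout_time < tau


def pooled_rates(n=20000, seed=1):
    """Events per unit of follow-up time, pooled within dropouts and completers."""
    rng = random.Random(seed)
    totals = {True: [0, 0.0], False: [0, 0.0]}
    for _ in range(n):
        count, follow_up, dropped = simulate_patient(rng)
        totals[dropped][0] += count
        totals[dropped][1] += follow_up
    return {
        "dropout": totals[True][0] / totals[True][1],
        "completer": totals[False][0] / totals[False][1],
    }


if __name__ == "__main__":
    print(pooled_rates())
```

Under these assumptions, patients who drop out early tend to have larger frailties and therefore a higher pooled event rate than patients who complete follow-up, so an analysis that ignores the dropout mechanism would misjudge the event rate.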

    Semiparametric regression analysis of panel count data with informative observation times

    This paper discusses regression analysis of panel count data that arise naturally when recurrent events are considered. For the analysis of panel count data, most of the existing methods have assumed that observation times are completely independent of recurrent events or given covariates, which may not be true in practice. We propose a joint modeling approach that uses an unobserved random variable and a completely unspecified link function to characterize the correlations between the response variable and the observation times. For inference about regression parameters, estimating equation approaches are developed without involving any estimation for latent variables, and the asymptotic properties of the resulting estimators are established. In addition, a technique is provided for assessing the adequacy of the model. The performance of the proposed estimation procedures is evaluated by means of Monte Carlo simulations, and a data set from a bladder tumor study is analyzed as an illustrative example.
    Keywords: Estimating equation; Informative observation times; Mean function model; Panel count data; Regression analysis

    Statistical Methods for Life History Analysis Involving Latent Processes

    Incomplete data often arise in the study of life history processes. Examples include missing responses, missing covariates, and unobservable latent processes in addition to right censoring. This thesis develops statistical models and methods to address these problems as they arise in oncology and chronic disease. Methods of estimation and inference in parametric, weakly parametric and semiparametric settings are investigated. Studies of chronic diseases routinely sample individuals subject to conditions on an event time of interest. In epidemiology, for example, prevalent cohort studies aiming to evaluate risk factors for survival following onset of dementia require subjects to have survived to the point of screening. In clinical trials designed to assess the effect of experimental cancer treatments on survival, patients are required to survive from the time of cancer diagnosis to recruitment. Such conditions yield samples featuring left-truncated event time distributions. Incomplete covariate data often arise in such settings, but standard methods do not deal with the fact that the covariate distribution is also affected by left truncation. We develop a likelihood and an estimation algorithm for handling incomplete covariate data in such settings. An expectation-maximization algorithm deals with the left truncation by using the covariate distribution conditional on the selection criterion. An extension to deal with sub-group analyses in clinical trials is described for the case in which the stratification variable is incompletely observed. In studies of affective disorder, individuals are often observed to experience recurrent symptomatic exacerbations warranting hospitalization. Interest lies in modeling the occurrence of such exacerbations over time and identifying associated risk factors to better understand the disease process.
In some patients, recurrent exacerbations are temporally clustered following disease onset, but cease to occur after a period of time. We develop a dynamic mover-stayer model in which a canonical binary variable associated with each event indicates whether the underlying disease has resolved. An individual whose disease process has not resolved will experience events following a standard point process model governed by a latent intensity. If and when the disease process resolves, the complete data intensity becomes zero and no further events will arise. An expectation-maximization algorithm is developed for parametric and semiparametric model fitting based on a discrete time dynamic mover-stayer model and a latent intensity-based model of the underlying point process. The method is applied to a motivating dataset from a cohort of individuals with affective disorder experiencing recurrent hospitalization for their mental health disorder. Interval-censored recurrent event data arise when the event of interest is not readily observed but the cumulative event count can be recorded at periodic assessment times. Extensions of the model-fitting techniques for the dynamic mover-stayer model that incorporate interval censoring are discussed. The likelihood and algorithm for estimation are developed for piecewise constant baseline rate functions and are shown to yield estimators with small empirical bias in simulation studies. Data on the cumulative number of damaged joints in patients with psoriatic arthritis are analysed to provide an illustrative application.

    Semiparametric Regression Analysis of Panel Count Data and Interval-Censored Failure Time Data

    This dissertation discusses three important research topics on semiparametric regression analysis of panel count data and interval-censored data. Both types of data arise commonly in real-life studies in many fields such as epidemiology, social science, and medical research. In these studies, subjects are usually examined multiple times at periodic or irregular follow-up examinations. For panel count data, the response variable is the count of some recurrent event, whose exact occurrence times are usually unknown. For interval-censored data, the response variable is the time to some event of interest, often called survival time or failure time, and the exact response time is never observed but is known to fall within some interval formed by two examination times. The primary goal for both types of data is to study the effects of covariates on the response variable, which can be accomplished by regression analysis. Chapter 1 of this dissertation provides detailed descriptions of panel count data and interval-censored data with several real-life examples. A literature review is conducted on existing approaches and commonly used semiparametric regression models for analyzing the two types of data. Some preliminary knowledge used in our approaches, such as monotone splines and the EM algorithm, is also presented in this chapter. In Chapter 2, we propose a gamma frailty non-homogeneous Poisson process model for the regression analysis of panel count data to account for the within-subject correlation. This topic is important because ignoring such within-subject correlation results in biased estimation and may lead to misleading conclusions, and the literature on this topic is limited. We propose an efficient estimation approach based on an EM algorithm. Our approach is robust to initial values, converges quickly, and provides variance estimates in closed form.
Our approach shows excellent performance in estimating both the regression parameters and the baseline mean function when within-subject correlation is present, and it can also be used when no such correlation exists. An R package PCDSpline has been developed and is available on CRAN to disseminate our approach. In Chapter 3, we study regression analysis of case 1 interval-censored data, also referred to as current status data, using the generalized odds-rate hazards (GORH) models. The GORH models are a general class of semiparametric models and have been widely used for analyzing right-censored data. However, their use for current status data is not found in the literature. We propose an efficient estimation approach with fixed p in the GORH models based on a novel EM algorithm. The proposed method is robust to initial values, fast to converge and provides variance estimates in closed form. A working model approach is proposed when the true value of p is known but does not require fitting the GORH models with different p values. The proposed approach and working model strategy are evaluated and show good performance in an extensive simulation study. They are illustrated by a large real-life data set. In Chapter 4, we study the joint modeling of panel count data and interval-censored failure time data motivated by a real-life data set about sexually transmitted infections (STI). The failure time of interest is the time from enrollment to acquiring a new STI, which has an interval-censored data structure. The other response variable is the number of unprotected sex acts over time, which has a panel count data structure. The proposed joint analysis based on an EM algorithm is more efficient than the univariate analysis of panel count data and interval-censored data separately. The proposed joint model and approach are applied to the STI data.
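
As an illustration of why a gamma frailty pairs naturally with an EM algorithm for Poisson-type panel counts (the setting of Chapter 2 above), the sketch below computes the conjugate posterior of a subject's frailty. It is not the dissertation's algorithm, whose details are not given in the abstract. It assumes a frailty Z ~ Gamma(shape nu, rate nu) and, conditional on Z, independent Poisson counts between examination times with means Z times the expected increments. The function names and arguments are hypothetical.

```python
def frailty_posterior(counts, expected_increments, nu):
    """Return (shape, rate) of the Gamma posterior of a subject's frailty.

    Assumptions (illustrative only):
      * prior frailty Z ~ Gamma(shape=nu, rate=nu), so E[Z] = 1;
      * counts[j] | Z ~ Poisson(Z * expected_increments[j]) independently,
        where expected_increments[j] is the covariate-adjusted increase of
        the baseline mean function between consecutive examination times.
    """
    if len(counts) != len(expected_increments):
        raise ValueError("counts and expected_increments must have equal length")
    # Gamma-Poisson conjugacy: add the total count to the shape and the
    # total expected count to the rate.
    shape = nu + sum(counts)
    rate = nu + sum(expected_increments)
    return shape, rate


def frailty_posterior_mean(counts, expected_increments, nu):
    """Posterior mean E[Z | counts], the quantity an E-step would plug in."""
    shape, rate = frailty_posterior(counts, expected_increments, nu)
    return shape / rate


if __name__ == "__main__":
    # A subject with more events than expected gets a frailty above 1.
    print(frailty_posterior_mean([1, 2], [1.0, 1.0], nu=2.0))
```

Because the posterior stays in the Gamma family, the expected frailty is available in closed form, which is one reason EM-type procedures for such models can be fast.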