
    Regression analysis of correlated interval-censored failure time data with a cured subgroup

    Interval-censored failure time data commonly occur in periodic follow-up studies such as epidemiological experiments, medical studies and clinical trials. By interval-censored data, we mean that the failure time of interest cannot be observed directly; instead, it is only known to lie within a time interval. Correlated failure time data commonly arise when there are multiple events per individual or when the study subjects are clustered into small groups. In this situation, subjects from the same cluster, or failure events from the same individual, are usually regarded as dependent, while subjects in different clusters, or failure events from different individuals, are assumed to be independent. Besides the within-cluster correlation, the cluster size may itself be informative, that is, it may carry information about the failure time of interest. A cured subgroup is another topic that has been discussed by many authors. In this situation, unlike the assumption in traditional survival models that all study subjects would eventually experience the failure event of interest given a long enough follow-up, some subjects may never experience, or may not be susceptible to, the event. Such subjects are treated as cured and assumed to belong to a cured subgroup of the study population. The research in this dissertation focuses on regression analysis of correlated interval-censored data with a cured subgroup via different approaches based on different data structures. In the first part of this dissertation, we discuss clustered interval-censored data with a cured subgroup and informative cluster size. To address this, we present a within-cluster resampling method in which a multiple imputation procedure is applied for estimation of the unknown parameters. To assess the performance of the proposed method, a simulation study is conducted and suggests that it works well in practical situations. The method is also applied to the set of real data that motivated this study. In the second part of this dissertation, we consider clustered interval-censored data with a cured subgroup via a non-mixture cure model. We present a maximum likelihood estimation procedure under the semiparametric transformation non-mixture cure model. To estimate the unknown parameters, an expectation-maximization (EM) algorithm based on a Poisson variable augmentation is developed. To assess the performance of the proposed method, a simulation study is conducted and suggests that it works well in practical situations. An application to a study conducted by the National Aeronautics and Space Administration that motivated this work is also provided. In the third part of this dissertation, we investigate bivariate interval-censored data with a cured subgroup. A sieve maximum likelihood estimation procedure under the semiparametric transformation non-mixture cure model based on Bernstein polynomials is presented. A simulation study is conducted to assess the finite-sample performance of the proposed method and suggests that the proposed model works well. A real data application from the study of AIDS Clinical Trial Group 181 is also provided.
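    A minimal Python sketch of the Bernstein-polynomial sieve idea referred to in the third part, shown only for the special promotion-time form of the non-mixture cure model; the degree, the time range [0, tau] and the cumulative-sum parameterization of the coefficients are illustrative choices, not the dissertation's implementation.

```python
import numpy as np
from scipy.special import comb


def bernstein_basis(t, m, tau):
    """Evaluate the m + 1 Bernstein basis polynomials of degree m on [0, tau]."""
    u = np.clip(np.atleast_1d(t).astype(float) / tau, 0.0, 1.0)
    k = np.arange(m + 1)
    return comb(m, k) * u[:, None] ** k * (1.0 - u[:, None]) ** (m - k)


def cumulative_baseline(t, phi, tau):
    """Nondecreasing sieve approximation Lambda_0(t) = sum_k alpha_k B_{k,m}(t).

    Monotonicity is enforced by taking the coefficients alpha_k as cumulative
    sums of nonnegative increments exp(phi_k)."""
    alpha = np.cumsum(np.exp(phi))              # 0 < alpha_0 <= ... <= alpha_m
    return bernstein_basis(t, len(phi) - 1, tau) @ alpha


def population_survival(t, x, beta, phi, tau):
    """Promotion-time cure model: S_pop(t | x) = exp(-exp(x'beta) * F(t)),
    with the proper baseline distribution F approximated by
    Lambda_0(t) / Lambda_0(tau); the cure fraction is exp(-exp(x'beta))."""
    F = cumulative_baseline(t, phi, tau) / cumulative_baseline(tau, phi, tau)[0]
    return np.exp(-np.exp(x @ beta) * F)
```

    In the more general semiparametric transformation cure model of the dissertation, the exponential link above would presumably be replaced by the chosen transformation, with beta and the sieve coefficients estimated jointly by maximum likelihood.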

    A semiparametric Bayesian proportional hazards model for interval censored data with frailty effects

    Background: Multivariate analysis of interval-censored event data based on classical likelihood methods is notoriously cumbersome, and likelihood inference for models that additionally include random effects is not available at all. Existing algorithms pose problems for practical users, such as matrix inversion, slow convergence and no assessment of statistical uncertainty. Methods: MCMC procedures combined with imputation are used to implement hierarchical models for interval-censored data within a Bayesian framework. Results: Two examples from clinical practice demonstrate the handling of clustered interval-censored event times as well as multilayer random effects for inter-institutional quality assessment. The software developed is called survBayes and is freely available on CRAN. Conclusion: The proposed software supports the solution of complex analyses in many fields of clinical epidemiology as well as health services research.
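    The "MCMC combined with imputation" step can be pictured as repeatedly completing the data by drawing exact event times inside the observed censoring intervals. The sketch below assumes, purely for illustration, a frailty proportional hazards model with an exponential baseline; the actual survBayes package uses a more flexible semiparametric baseline, and its interface is not shown here.

```python
import numpy as np

rng = np.random.default_rng(0)


def impute_event_time(left, right, rate):
    """Draw an exact event time from an Exponential(rate) distribution
    conditioned on lying in the observed censoring interval (left, right]."""
    u = rng.uniform(1.0 - np.exp(-rate * left), 1.0 - np.exp(-rate * right))
    return -np.log(1.0 - u) / rate


def imputation_sweep(intervals, x, beta, frailty, base_rate):
    """One data-augmentation sweep: complete the interval-censored outcomes.

    The completed event times could then be used in standard Bayesian updates
    of the regression coefficients, the baseline rate and the frailties."""
    rates = base_rate * frailty * np.exp(x @ beta)   # subject-specific hazards
    return np.array([impute_event_time(l, r, lam)
                     for (l, r), lam in zip(intervals, rates)])
```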

    A Gamma-frailty proportional hazards model for bivariate interval-censored data

    Correlated survival data naturally arise from many clinical and epidemiological studies. For the analysis of such data, the Gamma-frailty proportional hazards (PH) model is a popular choice because the regression parameters have marginal interpretations and the statistical association between the failure times can be explicitly quantified via Kendall’s tau. Despite their popularity, Gamma-frailty PH models for correlated interval-censored data have not received as much attention as analogous models for right-censored data. A Gamma-frailty PH model for bivariate interval-censored data is presented and an easy-to-implement expectation–maximization (EM) algorithm for model fitting is developed. The proposed model adopts a monotone spline representation to approximate the unknown conditional cumulative baseline hazard functions, significantly reducing the number of unknown parameters while retaining modeling flexibility. The EM algorithm is derived from a data augmentation procedure involving latent Poisson random variables. Extensive numerical studies illustrate that the proposed method can provide reliable estimation and valid inference, and is moreover robust to misspecification of the frailty distribution. To further illustrate its use, the proposed method is used to analyze data from an epidemiological study of sexually transmitted infections.
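    For reference, the shared gamma-frailty PH structure the abstract describes is conventionally written as below (a standard formulation rather than a quotation from the paper); with frailty variance theta, the within-pair association is Kendall's tau = theta / (theta + 2).

```latex
% Bivariate gamma-frailty proportional hazards model: conditional on the shared
% frailty eta_i, the two failure times of pair i are independent.
\begin{align*}
  \lambda_{ij}(t \mid \eta_i, x_{ij})
      &= \eta_i\, \lambda_{0j}(t)\, \exp\!\bigl(x_{ij}^{\top}\beta\bigr),
      \qquad j = 1, 2, \\
  \eta_i &\sim \mathrm{Gamma}(1/\theta,\, 1/\theta),
      \qquad \mathbb{E}[\eta_i] = 1,\quad \operatorname{Var}(\eta_i) = \theta, \\
  \tau_{\text{Kendall}} &= \frac{\theta}{\theta + 2}.
\end{align*}
```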

    Inference for time to event and sojourn time data under right censoring using reweighting approaches.

    In this dissertation research, we aim to solve problems for two types of survival data: clustered survival data with potentially informative cluster size, and sojourn time data. The methods for these two types of data are different; however, both involve right-censored observations, and we use reweighting approaches to deal with the censoring. In the first part of the dissertation research, we consider marginal AFT models for correlated survival data with potentially informative cluster size. Informative cluster size means that the size of the correlated groups may be predictive of their survival characteristics. Two competing proposals, the cluster-weighted AFT (CWAFT) marginal model and the non-cluster-weighted AFT (NCWAFT) marginal model, are investigated. Simulation and theoretical results show that the CWAFT approach produces unbiased parameter estimation, whereas the NCWAFT model does not when the cluster size is informative. We use probability-probability plots to investigate the statistical properties of confidence intervals and adopt Wald tests to examine power properties for the CWAFT model. To illustrate our analysis, we apply the CWAFT model to a dental study data set. In the second part of the dissertation research, we consider the problem of comparing the sojourn time distributions of a transient state in a general multi-state system between two samples (groups) when the transition times are right censored. Under this setup, the censoring induced on the waiting times is complex, since both state entry and exit are subject to right censoring. Using the reweighting principle, a two-sample Mann-Whitney type U-statistic is constructed that compares only the uncensored state sojourn times from the two distributions. A second Mann-Whitney type statistic is also constructed using a different reweighting that allows comparison when one of the two sojourn times is either uncensored or singly censored. While both statistics are asymptotically unbiased and reduce to the standard Mann-Whitney statistic when there is no censoring, the second statistic has smaller variance since it effectively uses more pairs of observations. Asymptotic normality of these statistics is established. A test of the equality of sojourn time distributions in two independent samples is constructed by symmetrizing the pair-specific Mann-Whitney type statistics mentioned above. The testing methodology is illustrated using a data set of kidney disease patients.
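    The reweighting principle used here is, in its simplest form, inverse-probability-of-censoring weighting: restrict to uncensored observations and weight each by the inverse of the Kaplan-Meier estimate of the censoring survival. The sketch below illustrates that simple case for two ordinary right-censored samples; the dissertation's statistics for sojourn times, where both state entry and exit are censored, require more elaborate weights, and the function names here are illustrative.

```python
import numpy as np


def censoring_survival(times, events):
    """Kaplan-Meier estimate of the censoring survival G(t-) = P(C >= t).
    `events` is 1 for an observed event and 0 for censoring; ties are handled naively."""
    order = np.argsort(times)
    t, cens = times[order], 1 - events[order]
    at_risk = len(t) - np.arange(len(t))
    surv = np.cumprod(1.0 - cens / at_risk)

    def G(x):
        idx = np.searchsorted(t, x, side="left")     # censorings strictly before x
        return 1.0 if idx == 0 else surv[idx - 1]

    return np.vectorize(G)


def ipcw_mann_whitney(t1, d1, t2, d2):
    """IPCW Mann-Whitney statistic estimating P(T1 < T2) from two independent
    right-censored samples, using only pairs where both observations are uncensored."""
    G1, G2 = censoring_survival(t1, d1), censoring_survival(t2, d2)
    w1, w2 = d1 / G1(t1), d2 / G2(t2)                # censored observations get weight 0
    less = (t1[:, None] < t2[None, :]).astype(float)
    return float(np.sum(w1[:, None] * w2[None, :] * less) / (len(t1) * len(t2)))
```

    With no censoring the weights are all one and the statistic reduces to the usual Mann-Whitney proportion, mirroring the property noted in the abstract.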

    Marginal Proportional Hazards Models for Clustered Interval-Censored Data with Time-Dependent Covariates

    The Botswana Combination Prevention Project was a cluster-randomized HIV prevention trial whose follow-up period coincided with Botswana’s national adoption of a universal test-and-treat strategy for HIV management. Of interest is whether, and to what extent, this change in policy (i) modified the observed preventative effects of the study intervention and (ii) was associated with a reduction in the population-level incidence of HIV in Botswana. To address these questions, we propose a stratified proportional hazards model for clustered interval-censored data with time-dependent covariates and develop a composite expectation-maximization algorithm that facilitates estimation of model parameters without placing parametric assumptions on either the baseline hazard functions or the within-cluster dependence structure. We show that the resulting estimators for the regression parameters are consistent and asymptotically normal. We also propose and provide theoretical justification for the use of the profile composite likelihood function to construct a robust sandwich estimator for the variance. We characterize the finite-sample performance and robustness of these estimators through extensive simulation studies. Finally, we conclude by applying this stratified proportional hazards model to a re-analysis of the Botswana Combination Prevention Project, with the national adoption of a universal test-and-treat strategy now modeled as a time-dependent covariate.
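    The stratified proportional hazards model with time-dependent covariates referred to above has the generic form below; the subscripting (subject i in cluster k, stratum s(k)) is notation chosen here for illustration rather than taken from the paper.

```latex
% Stratified PH model with time-dependent covariates: the baseline hazards
% lambda_{0,s} are left unspecified and may differ across strata.
\[
  \lambda_{ki}\bigl(t \mid Z_{ki}(\cdot)\bigr)
    = \lambda_{0,\,s(k)}(t)\,
      \exp\!\bigl\{\beta^{\top} Z_{ki}(t)\bigr\}.
\]
```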

    Multivariate survival models for interval-censored udder quarter infection times

    A semiparametric cure model for interval-censored data

    Bayesian semiparametric inference for multivariate doubly-interval-censored data

    Based on a data set obtained in a dental longitudinal study conducted in Flanders (Belgium), the joint time-to-caries distribution of permanent first molars was modeled as a function of covariates. This involves an analysis of multivariate continuous doubly-interval-censored data since: (i) the emergence time of a tooth and the time it experiences caries were recorded yearly, and (ii) events on teeth of the same child are dependent. To model the joint distribution of the emergence times and the times to caries, we propose a dependent Bayesian semiparametric model. A major feature of the proposed approach is that survival curves can be estimated without imposing assumptions such as proportional hazards, additive hazards, proportional odds or accelerated failure time. (Published in the Annals of Applied Statistics, http://dx.doi.org/10.1214/10-AOAS368, by the Institute of Mathematical Statistics, http://www.imstat.org/aoas/.)
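    For context, and using standard notation not taken from the paper: write U for the emergence (onset) time and V for the time caries occurs, so the quantity of interest is the time to caries T = V - U. With yearly examinations both U and V are only known up to intervals, which is what makes the data doubly interval-censored.

```latex
% Doubly-interval-censored data: neither the onset U nor the event time V is
% observed exactly; only bracketing intervals from the yearly examinations are seen.
\[
  U \in [U_L, U_R], \qquad V \in [V_L, V_R], \qquad T = V - U,
\]
\[
  T \in \bigl[\max(V_L - U_R,\, 0),\; V_R - U_L\bigr].
\]
```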

    Multilevel modelling of event history data: comparing methods appropriate for large datasets

    When analysing medical or public health datasets, it is often of interest to measure the time until a particular pre-defined event occurs, such as death from some disease. Because the health status of individuals living within the same area tends to be more similar than that of individuals from different areas, event times of individuals from the same area may be correlated. As a result, multilevel models must be used to account for the clustering of individuals within the same geographical location, and when the outcome is time until some event, multilevel event history models must be used.
    Although software such as MLwiN does exist for fitting multilevel event history models, computational requirements limit the use of these models for large datasets. For example, to fit the proportional hazards model (PHM), the most commonly used event history model for modelling the effect of risk factors on event times, MLwiN fits a Poisson model to a person-period dataset. The person-period dataset is created by rearranging the original dataset so that each individual has a line of data for every risk set they survive through until either censoring or the event of interest occurs. When time is treated as a continuous variable, so that each risk set corresponds to a distinct event time, as is the case for the PHM, the person-period dataset can be very large. This presents a problem for those working in public health, as datasets used for measuring and monitoring public health are typically large. Furthermore, individuals may be followed up for a long period of time, which also contributes to a large person-period dataset. A further complication is that interest may be in modelling a rare event, resulting in a high proportion of censored observations; this too can be problematic when estimating multilevel event history models. Since multilevel event history models are important in public health, the aim of this thesis is to develop these models so they can be fitted to large datasets, considering in particular datasets with long periods of follow-up and rare events.
    Two datasets are used throughout the thesis to investigate three possible alternatives to fitting the multilevel proportional hazards model in MLwiN in order to overcome these problems. The first is a moderately sized Scottish dataset, which is the main focus of the thesis and is used as a ‘training dataset’ to explore the limitations of existing software packages for fitting multilevel event history models and to investigate alternative methods. The second dataset, from Sweden, is used to test the effectiveness of each alternative method when fitted to a much larger dataset. The adequacy of the alternative methods is assessed on the following criteria: how effective they are at reducing the size of the person-period dataset, how similar the parameter estimates obtained using each method are to those from the PHM, and how easy they are to implement.
    The first alternative method defines discrete-time risk sets and then estimates discrete-time hazard models via multilevel logistic regression models fitted to a person-period dataset. The second alternative method aggregates the data of individuals within the same higher-level unit who have the same values for the covariates in a particular model. Aggregating the data in this way means that one line of data is used to represent all such individuals, since these individuals are at risk of experiencing the event of interest at the same time; this method is termed ‘grouping according to covariates’. Both continuous-time and discrete-time event history models can be fitted to the aggregated person-period dataset. The ‘grouping according to covariates’ method and the first method, which defines discrete-time risk sets, are both implemented in MLwiN and estimated by pseudo-likelihood methods. The third and final method fits Bayesian event history (frailty) models using Markov chain Monte Carlo (MCMC) methods of estimation. These models are fitted in WinBUGS, a software package specially designed to make practical MCMC methods available to applied statisticians. In WinBUGS, an additive frailty model is adopted and a Weibull distribution is assumed for the survivor function.
    The methodological findings were that the discrete-time method led to a successful reduction in the size of the continuous-time person-period dataset; however, it was necessary to experiment with the length of the time intervals in order to use the widest interval that did not influence the parameter estimates. The grouping according to covariates method worked best when there was, on average, a larger number of individuals per higher-level unit, there were few risk factors in the model, and few or none of the risk factors were continuous. The Bayesian method could be favourable, as no data expansion is required to fit the Weibull model in WinBUGS and time is treated as a continuous variable; however, models took much longer to run using MCMC methods of estimation than using likelihood methods. This thesis showed that it was possible to use a re-parameterised version of the Weibull model, as well as a variance expansion technique, to overcome slow convergence by reducing correlation in the Markov chains. This may be a more efficient way to reduce computing time than running further iterations.
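    As a rough Python sketch of the person-period expansion and discrete-time setup described above; the column names, interval cut-points and handling of within-interval censoring are illustrative choices, not MLwiN's actual implementation.

```python
import pandas as pd


def person_period(df, time_col="time", event_col="event", cuts=(0, 1, 2, 5, 10)):
    """Expand one-row-per-individual survival data into a discrete-time
    person-period dataset: one row per individual per interval they enter,
    with a binary response indicating whether the event occurred in that interval."""
    rows = []
    for _, person in df.iterrows():
        for j in range(len(cuts) - 1):
            start, stop = cuts[j], cuts[j + 1]
            if person[time_col] <= start:
                break                                   # no longer at risk
            event_here = int(person[event_col] == 1 and person[time_col] <= stop)
            rows.append({**person.drop([time_col, event_col]).to_dict(),
                         "interval": j, "y": event_here})
            if event_here:
                break                                   # leaves the risk set
    return pd.DataFrame(rows)


# The expanded data could then be passed to a multilevel logistic regression
# (e.g. a random intercept for area) to fit the discrete-time hazard model.
toy = pd.DataFrame({"id": [1, 2], "area": ["A", "B"],
                    "time": [3.0, 1.5], "event": [1, 0]})
expanded = person_period(toy)
```

    The finding above about experimenting with interval length corresponds, in this sketch, to choosing the coarsest `cuts` grid that leaves the parameter estimates effectively unchanged.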