    Sample size and robust marginal methods for cluster-randomized trials with censored event times

    This is the peer reviewed version of the following article: Zhong Yujie, and Cook Richard J. (2015), Sample size and robust marginal methods for cluster-randomized trials with censored event times, Statist. Med., 34, pages 901–923. doi: 10.1002/sim.6395, which has been published in final form at http://dx.doi.org/10.1002/sim.6395. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for self-archiving. In cluster-randomized trials, intervention effects are often formulated by specifying marginal models, fitting them under a working independence assumption, and using robust variance estimates to address the association in the responses within clusters. We develop sample size criteria within this framework, with analyses based on semiparametric Cox regression models fitted with event times subject to right censoring. At the design stage, copula models are specified to enable derivation of the asymptotic variance of estimators from a marginal Cox regression model and to compute the number of clusters necessary to satisfy power requirements. Simulation studies demonstrate the validity of the sample size formula in finite samples for a range of cluster sizes, censoring rates and degrees of within-cluster association among event times. The power and relative efficiency implications of copula misspecification are studied, as well as the effect of within-cluster dependence in the censoring times. Sample size criteria and other design issues are also addressed for the setting where the event status is only ascertained at periodic assessments and times are interval censored. Natural Sciences and Engineering Research Council of Canada (RGPIN 155849); Canadian Institutes for Health Research (FRN 13887); Canada Research Chair (Tier 1) – CIHR funded (950-226626

    Life History Analysis with Response-Dependent Observation

    This thesis deals with statistical issues in the analysis of dependent failure time data under complex observation schemes. These observation schemes may yield right-censored, interval-censored and current status data and may also involve response-dependent selection of individuals. The contexts in which these complications arise include family studies, clinical trials, and population studies. Chapter 2 is devoted to the development and study of statistical methods for family studies, motivated by work conducted in the Centre for Prognosis Studies in the Rheumatic Disease at the University of Toronto. Rheumatologists at this centre are interested in studying the nature of within-family dependence in the occurrence of psoriatic arthritis (PsA) to gain insight into the genetic basis for this disease. Families are sampled by selecting members from a clinical registry of PsA patients maintained at the centre and recruiting their respective consenting family members; the member of the registry leading to the sampling of the family is called the proband. Information on the disease onset time for non-probands may be collected by recall or a review of medical records, but some non-probands simply provide their disease status at the time of assessment. As a result, family members may provide a combination of observed or right-censored onset times and current status information. Gaussian copula-based models are studied as a means of flexibly characterizing the within-family association in disease onset times. Likelihood and composite likelihood procedures are also investigated; the latter, like the estimating function approach, reduce both the need to specify high-order dependencies and the computational burden. Valid analysis of this type of data must address the response-biased sampling scheme, under which at least one affected family member (the proband) has a right-truncated onset time.
This right-truncation scheme, combined with the low incidence of disease among non-probands, means there is little information about the marginal onset time distribution from the family data alone, so we exploit auxiliary data from an independent sample of independent individuals to enhance the information on the parameters in the marginal age of onset distribution. For composite likelihood approaches, we consider simultaneous and two-stage estimation procedures; the latter greatly simplify the computational burden, especially when weakly, semi- or non-parametric marginal models are adopted. The proposed models and methods are examined in simulation studies and are applied to data from the PsA family study, yielding important insight regarding the parent of origin hypothesis. Cluster-randomized trials are employed when it is appropriate on ethical, practical, or contextual grounds to assign groups of individuals to receive one of two or more interventions to be compared. This design also offers a way of minimizing contamination across treatment groups and enhancing compliance. Although considerable attention has been directed at the development of sample size formulae for cluster-randomized trials with continuous or discrete outcomes, relatively little work has been done for trials involving censored event times. In Chapter 3, asymptotic theory for sample size calculations for correlated failure time data arising in cluster-randomized trials is explored. When the intervention effect is specified through a semi-parametric proportional hazards model fitted under a working independence assumption, robust variance estimates are routinely used. At the design stage, however, some model specification is required for the marginal distributions, and copula models are utilized to accommodate the within-cluster dependence.
This method is appealing since the intervention effects are specified in terms of the marginal proportional hazards formulation while the within-cluster dependence is modeled by a separate association parameter. The resulting joint model enables one to evaluate the robust sandwich variance, based on which the sample size criteria for right-censored event times are developed. This approach has also been extended to deal with interval-censored event times and within-cluster dependence in the random right censoring times. The validity of the sample size formula in finite samples was investigated via simulation for a range of cluster sizes, censoring rates and degrees of within-cluster association among event times. The power and efficiency implications of copula misspecification are studied, along with the effect of within-cluster dependence in the censoring times. The proposed sample size formula can be applied in a broad range of practical settings, and an application to a study of otitis media is given for illustration. Chapter 4 considers dependent failure time data in a slightly different context where the events correspond to transitions in a multistate model. A central goal in oncology is the reduction of mortality due to cancer. The therapeutic advances in the treatment of many cancers and the increasing pressure to ensure experimental treatments are evaluated in a timely and cost-effective manner have made it challenging to design feasible trials with adequate power to detect clinically important effects based on the time from randomization to death. This has led to increased use of the composite endpoint of progression-free survival, defined as the time from randomization to the first of progression or death. While trials may be designed with progression or progression-free survival as the primary endpoint, regulators are interested in statements about the effect of treatment on survival following progression.
One approach to investigating this is to estimate the treatment effect on the time from progression to death, but this is not an analysis that benefits from randomization, since the only individuals who contribute to it are those who experienced progression. Also, assessing the treatment effect on marginal features may induce dependent censoring of the survival time following progression, because other variables that affect both progression and post-progression survival time are omitted from the model. In Chapter 4 we consider a classical illness-death model which can be used to characterize the joint distribution of progression and death in this setting. Inverse probability weighting can then be used to address the observational nature of this improper sub-group analysis and the dependent censoring. Such inverse weighted equations yield consistent estimates of the causal treatment effect by accounting for the effect of treatment and any prognostic factors that may be shared between the model for the sojourn time distribution in the progression state and the transition intensity for progression. Due to the non-collapsibility of the Cox regression model we focus here on additive regression models. Chapter 5 discusses prevalent cohort studies and the problem of measurement error in the reported disease onset time, along with other topics for further research.

    Combining Multiple Survival Endpoints within a Single Statistical Analysis.

    The aim of this thesis is to develop methodology for combining multiple endpoints within a single statistical analysis that compares the responses of patients treated with a novel treatment with those of control patients treated conventionally. The focus is on interval-censored bivariate survival data, and five real data sets from previous studies concerning multiple responses are used to illustrate the techniques developed. The background to survival analysis is introduced by a general description of survival data, and an overview of existing methods and underlying models is included. A review is given of two of the most popular survival analysis methods, namely the logrank test and Cox's proportional hazards model. The global score test methodology for combining multiple endpoints is described in detail, and application to real data demonstrates its benefits. The correlation between two score statistics arising from bivariate interval-censored survival data is the core of this research. The global score test methodology is extended to the case of bivariate interval-censored survival data and a complementary log-log link is applied to derive the covariance and the correlation between the two score statistics. A number of common scenarios are considered in this investigation and the accuracy of the estimator is evaluated by means of extensive simulations. An established method, namely the approach of Wei, Lin and Weissfeld, is examined and compared with the proposed method using both real and simulated data. It is concluded that our method is accurate, consistent and comparable to the competitor. This study marked the first successful development of the global score test methodology for bivariate survival data, employing a new approach to the derivation of the covariance between two score statistics on the basis of an interval-censored model. Additionally, the relationship between the jackknife technique and the Wei, Lin and Weissfeld method has been clarified

    Statistical Models and Methods for Dependent Life History Processes

    This thesis deals with statistical issues in the analysis of complex life history processes which have characteristics of heterogeneity and dependence. We are motivated, in this thesis, by three specific types of processes: (i) processes featuring recurrent episodic conditions, (ii) multi-type recurrent events, and (iii) clustered multistate processes as arise in family studies. In chronic diseases featuring recurrent episodic conditions, symptom onset is followed by a period during which symptoms are present until recovery. In the analysis of data from such processes, analysis is often based only on the recurrent onset of disease, ignoring the duration of symptoms. This loss of information may lead to incorrect conclusions. In Chapter 2, we propose a novel model for an alternating two-state process, comprising a symptom-free state and a symptomatic state, to recognize the duration of symptoms. This approach reflects the dynamics of an individual's disease process and helps in understanding the course of disease. Intensity-based models with multiplicative random effects are considered where the disease onset time is governed by a conditionally Markov intensity and the time of recovery is governed by a conditionally semi-Markov intensity. A bivariate random effect with one multiplicative component for each intensity is introduced to accommodate between-individual heterogeneity, and dependence between the two random effect components offers a natural and more general framework for modeling the two-state process. A copula function is used for the joint distribution of the random effects, which retains the marginal features and gives flexible choices of dependence structure. The proposed model is semiparametric, and estimation is carried out using an expectation-maximization algorithm. The aforementioned problem leads us to investigate the impact of ignoring symptom duration in a randomized trial setting.
In Chapter 3, we define two risk sets for recurrent event analyses: one includes individuals during their symptomatic periods, and the other excludes individuals from the risk set during symptomatic periods. In a clinical trial, the balance between treatment groups in unmeasured confounders present at the time of randomization can be lost following randomization as the risk set changes; thus, retaining individuals in the risk set is a common approach. Here we examine asymptotic and empirical biases of estimators from rate-based models when the two different risk sets are applied. We assume that the true underlying process is an alternating two-state process where the true risk set is the one that excludes individuals when they are experiencing an exacerbation. We consider two scenarios for the true model. In the first, there is no between-individual variation for each process and no dependence between the two processes. The second uses the dependent alternating two-state model proposed in Chapter 2. Issues of model misspecification and causal inference are considered. When the focus is on clinical trials, the power implications of risk set misspecification are of interest. In Chapter 4, attention is directed at multiple recurrent events where each endpoint is of interest. The use of a composite endpoint, defined as the time of the first event of any type, is a simple way to analyse such data. However, when multiple events are of comparable importance, a composite endpoint analysis may not be suitable. We propose a copula-based model for multi-type recurrent events where each type of recurrent event process arises from a mixed-Poisson model and the random effects link the event types through a copula function. When more than two types of events are considered, composite likelihood is adopted to ease the computational burden, and simultaneous and two-stage estimation are explored.
An aim of family studies is typically to gain knowledge about factors governing the inheritance of diseases. One may be interested in examining the dependence in disease onset between family members, and in identifying genetic markers associated with heritable disease. A common procedure is to collect families through probands, whereby affected individuals are selected from a disease registry and their family members (non-probands) are then recruited for examination. This approach to sampling families motivates us to consider the disease onset process along with survival, since the proband must be diseased and alive to be recruited, and family members may need to be alive. In Chapter 5, we propose a model for a clustered illness-death process for family studies which accounts for the semi-competing risks problem for disease onset as well as biased sampling. We model within-family association in the age of disease onset via a copula function applied to the possibly latent disease onset time, and incorporate survival through a marginal illness-death model. The ascertainment condition is reflected in the likelihood or composite likelihood construction. Two study designs regarding the recruitment of family members are considered. One involves the collection of disease history from family members via the proband or medical records. The other requires family members to undergo a medical examination, in which case they must be alive at the time of the family study. Family data alone are insufficient to estimate all of the parameters of the illness-death processes. We therefore make use of auxiliary data, including population mortality data and additional registry data, to address the estimability issue. Another source of auxiliary data is a current status survey. The issue of missing genetic markers is also addressed in each study design.

    Untangling hotel industry’s inefficiency: An SFA approach applied to a renowned Portuguese hotel chain

    The present paper explores the technical efficiency of four hotels from Teixeira Duarte Group, a renowned Portuguese hotel chain. An efficiency ranking is established for these four hotel units located in Portugal using Stochastic Frontier Analysis. This methodology makes it possible to discriminate between measurement error and systematic inefficiencies in the estimation process, and thus to investigate the main causes of inefficiency. Several suggestions concerning efficiency improvement are offered for each hotel studied.

    Reliability Analysis And Optimal Maintenance Planning For Repairable Multi-Component Systems Subject To Dependent Competing Risks

    Modern engineering systems generally consist of multiple components that interact in a complex manner. Reliability analysis of multi-component repairable systems plays a critical role in system safety and cost reduction. Establishing reliability models and scheduling optimal maintenance plans for multi-component repairable systems, however, is still a big challenge when considering the dependency of component failures. Existing models commonly make prior assumptions, without statistical verification, as to whether different component failures are independent or not. In this dissertation, data-driven systematic methodologies to characterize component failure dependency of complex systems are proposed. In CHAPTER 2, a parametric reliability model is proposed to capture the statistical dependency among different component failures under a partially perfect repair assumption. Based on the proposed model, statistical hypothesis tests are developed to test the dependency of component failures. In CHAPTER 3, two reliability models for multi-component systems with dependent competing risks under imperfect repair assumptions are proposed, i.e., a generalized dependent latent age model and a copula-based trend-renewal process model. The generalized dependent latent age model generalizes the partially perfect repair model by involving the extended virtual age concept. The copula-based trend-renewal process model utilizes multiple trend functions to transform the failure times from the original time domain to a transformed time domain, in which the repair conditions can be treated as partially perfect. Parameter estimation methods for both models are developed. In CHAPTER 4, based on the generalized dependent latent age model, two periodic inspection-based maintenance policies are developed for a multi-component repairable system subject to dependent competing risks.
The first maintenance policy assumes all the components are restored to as good as new once a failure is detected, i.e., the whole system is replaced. The second maintenance policy considers partially perfect repair, i.e., only the failed component is replaced after a failure is detected. Both maintenance policies are optimized to minimize the expected average maintenance cost per unit time. The developed methodologies are demonstrated through applications to real engineering systems.