
    Multilevel modelling of event history data: comparing methods appropriate for large datasets

    When analysing medical or public health datasets, it is often of interest to measure the time until a particular pre-defined event occurs, such as death from some disease. Because the health status of individuals living within the same area tends to be more similar than that of individuals from different areas, event times of individuals from the same area may be correlated. Multilevel models must therefore be used to account for the clustering of individuals within the same geographical location, and when the outcome is the time until some event, multilevel event history models are required. Although software such as MLwiN exists for fitting multilevel event history models, computational requirements limit their use with large datasets. For example, to fit the proportional hazards model (PHM), the most commonly used event history model for modelling the effect of risk factors on event times, MLwiN fits a Poisson model to a person-period dataset. The person-period dataset is created by rearranging the original dataset so that each individual has a line of data for every risk set they survive, until either censoring or the event of interest occurs. When time is treated as a continuous variable, so that each risk set corresponds to a distinct event time, as is the case for the PHM, the person-period dataset can be very large. This presents a problem for those working in public health, as datasets used for measuring and monitoring public health are typically large. Furthermore, individuals may be followed up for a long period of time, which also enlarges the person-period dataset. A further complication is that interest may be in modelling a rare event, resulting in a high proportion of censored observations; this too can be problematic when estimating multilevel event history models.

    Since multilevel event history models are important in public health, the aim of this thesis is to develop these models so that they can be fitted to large datasets, considering in particular datasets with long periods of follow-up and rare events. Two datasets are used throughout the thesis to investigate three possible alternatives to fitting the multilevel proportional hazards model in MLwiN in order to overcome the problems discussed. The first is a moderately sized Scottish dataset, the main focus of the thesis, which serves as a ‘training dataset’ for exploring the limitations of existing software packages for fitting multilevel event history models and for investigating alternative methods. The second dataset, from Sweden, is used to test the effectiveness of each alternative method on a much larger dataset. The adequacy of the alternative methods is assessed against the following criteria: how effectively they reduce the size of the person-period dataset, how similar the parameter estimates they produce are to those from the PHM, and how easy they are to implement.

    The first alternative method defines discrete-time risk sets and then estimates discrete-time hazard models via multilevel logistic regression models fitted to a person-period dataset. The second aggregates the data of individuals within the same higher-level unit who have the same values for the covariates in a particular model; one line of data then represents all such individuals, since they are at risk of experiencing the event of interest at the same time. This method is termed ‘grouping according to covariates’. Both continuous-time and discrete-time event history models can be fitted to the aggregated person-period dataset. The ‘grouping according to covariates’ method and the discrete-time risk set method are both implemented in MLwiN using pseudo-likelihood methods of estimation. The third and final method fits Bayesian event history (frailty) models using Markov chain Monte Carlo (MCMC) methods of estimation. These models are fitted in WinBUGS, a software package specially designed to make practical MCMC methods available to applied statisticians. In WinBUGS, an additive frailty model is adopted and a Weibull distribution is assumed for the survivor function.

    The methodological findings were that the discrete-time method successfully reduced the size of the continuous-time person-period dataset, although it was necessary to experiment with the length of the time intervals in order to use the widest interval that did not influence the parameter estimates. The grouping according to covariates method worked best when there was, on average, a larger number of individuals per higher-level unit, there were few risk factors in the model, and few or none of the risk factors were continuous. The Bayesian method can be favourable because no data expansion is required to fit the Weibull model in WinBUGS and time is treated as a continuous variable; however, models took much longer to run using MCMC methods of estimation than with likelihood methods. The thesis showed that it was possible to use a re-parameterised version of the Weibull model, together with a variance expansion technique, to overcome slow convergence by reducing correlation in the Markov chains. This may be a more efficient way to reduce computing time than running further iterations.
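    The data expansion described above can be illustrated with a minimal sketch. The snippet below (Python with pandas, not the MLwiN workflow used in the thesis) builds a continuous-time person-period dataset, with one record per individual per risk set, and a coarser discrete-time version with one record per individual per interval. The toy data and all column names are illustrative assumptions.

```python
import pandas as pd

# Toy survival data: one row per individual, clustered by area.
df = pd.DataFrame({
    "id":    [1, 2, 3, 4],
    "area":  ["A", "A", "B", "B"],
    "time":  [2.3, 5.1, 5.1, 7.8],   # follow-up time
    "event": [1, 0, 1, 1],           # 1 = event observed, 0 = censored
    "x":     [0, 1, 1, 0],           # a single binary risk factor
})

def continuous_time_person_period(data):
    """One record per individual per risk set (distinct observed event time),
    the expansion needed to fit the PHM as a Poisson model."""
    event_times = sorted(data.loc[data["event"] == 1, "time"].unique())
    rows = []
    for _, p in data.iterrows():
        for t in event_times:
            if t > p["time"]:
                break                               # no longer at risk
            rows.append({"id": p["id"], "area": p["area"], "x": p["x"],
                         "risk_time": t,
                         "y": int(p["event"] == 1 and p["time"] == t)})
    return pd.DataFrame(rows)

def discrete_time_person_period(data, width):
    """One record per individual per discrete interval of the given width;
    fewer records, at the cost of grouping event times into intervals."""
    rows = []
    for _, p in data.iterrows():
        k = 0
        while k * width < p["time"]:
            end = (k + 1) * width
            rows.append({"id": p["id"], "area": p["area"], "x": p["x"],
                         "interval": k,
                         "y": int(p["event"] == 1 and p["time"] <= end)})
            k += 1
    return pd.DataFrame(rows)

print(len(continuous_time_person_period(df)))   # records in the continuous-time expansion
print(len(discrete_time_person_period(df, 4)))  # records with intervals of width 4
```

    On this toy dataset the saving is negligible, but with many distinct event times and long follow-up the discrete-time expansion is far smaller, which is the reduction the thesis exploits; grouping according to covariates would shrink the expanded dataset further by collapsing identical records within a higher-level unit.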

    Treatment Effect Quantification for Time-to-event Endpoints -- Estimands, Analysis Strategies, and beyond

    A draft addendum to ICH E9 was released for public consultation in August 2017. The addendum focuses on two topics particularly relevant to randomized confirmatory clinical trials: estimands and sensitivity analyses. The need to amend ICH E9 grew out of the realization that the objectives of a clinical trial as stated in the protocol are often not aligned with the quantification of the "treatment effect" reported in a regulatory submission. We embed time-to-event endpoints in the estimand framework and discuss how the four estimand attributes described in the addendum apply to such endpoints. We point out that if the proportional hazards assumption is not met, the estimand targeted by the most prevalent methods used to analyze time-to-event endpoints, the logrank test and Cox regression, depends on the censoring distribution. For a large randomized clinical trial, we discuss how the analyses of the primary and secondary endpoints, as well as the sensitivity analyses actually performed in the trial, can be seen in the context of the addendum; to the best of our knowledge, this is the first attempt to do so for a trial with a time-to-event endpoint. Questions that remain open with the addendum for time-to-event endpoints and beyond are formulated, and recommendations for the planning of future trials are given. We hope that this will contribute to developing a common framework, based on the final version of the addendum, that can be applied to designs, protocols, statistical analysis plans, and clinical study reports in the future.
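    The point that the Cox/logrank estimand depends on the censoring distribution under non-proportional hazards can be illustrated with a small simulation. The sketch below (Python, assuming numpy, pandas, and the lifelines package; the delayed-effect scenario and all variable names are illustrative, not taken from the paper) fits the same Cox model to the same underlying event times under two administrative censoring times and obtains different hazard ratios.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(1)
n = 20000
arm = rng.integers(0, 2, n)                 # 0 = control, 1 = treated

# Non-proportional hazards: treated arm has a delayed benefit
# (no effect before t = 1, event times stretched afterwards).
t_event = rng.exponential(1.0, n)
delayed = np.where((arm == 1) & (t_event > 1.0),
                   1.0 + (t_event - 1.0) * 3.0, t_event)

def cox_hr(followup):
    """Hazard ratio from a Cox fit with administrative censoring at `followup`."""
    time = np.minimum(delayed, followup)
    event = (delayed <= followup).astype(int)
    data = pd.DataFrame({"time": time, "event": event, "arm": arm})
    cph = CoxPHFitter().fit(data, duration_col="time", event_col="event")
    return float(np.exp(cph.params_["arm"]))

print(cox_hr(1.5))    # short follow-up: mostly the effect-free early period
print(cox_hr(10.0))   # long follow-up: the delayed benefit dominates
```

    Under proportional hazards both calls would target the same quantity; the discrepancy arises only because the hazard ratio changes over time, so the weight each follow-up period receives in the single reported number is driven by the censoring pattern.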

    Autonomic State Management for Optimistic Simulation Platforms

    We present the design and implementation of an autonomic state manager (ASM) tailored for integration within optimistic parallel discrete event simulation (PDES) environments based on the C programming language and the Executable and Linkable Format (ELF), and developed for execution on x86-64 architectures. With ASM, the state of any logical process (LP), namely the individual (concurrent) simulation unit that is part of the simulation model, is allowed to be scattered across dynamically allocated memory chunks managed via the standard API (e.g., malloc/free). Also, the application programmer is not required to provide any serialization/deserialization module in order to take a checkpoint of the LP state, or to restore it in case a causality error occurs during the optimistic run, or to indicate which portions of the state are updated by event processing so as to allow incremental checkpointing. All these tasks are handled by ASM in a fully transparent manner via (A) runtime identification (with chunk-level granularity) of the memory map associated with the LP state, and (B) runtime tracking of the memory updates occurring within chunks belonging to the dynamic memory map. The co-existence of the incremental and non-incremental log/restore modes is achieved via dual versions of the same application code, transparently generated by ASM via compile/link-time facilities. Also, the dynamic selection of the best-suited log/restore mode is actuated by ASM on the basis of an innovative modeling/optimization approach which takes into account the stability of each operating mode with respect to variations of the model/environmental execution parameters.
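    As a conceptual illustration only, the toy sketch below mimics, in Python rather than the transparent C/ELF machinery the abstract describes, the idea of chunk-level state logging: writes mark chunks as dirty, a full checkpoint copies every chunk while an incremental checkpoint copies only the dirtied ones, and a restore replays a full checkpoint plus later increments. Class and method names are invented for illustration and do not correspond to the ASM interface.

```python
import copy

class ChunkedState:
    """Toy LP state held as a set of chunks; writes are tracked so that an
    incremental checkpoint saves only the chunks dirtied since the last one."""

    def __init__(self):
        self.chunks = {}          # chunk id -> mutable payload
        self.dirty = set()        # chunk ids updated since the last checkpoint

    def write(self, cid, payload):
        self.chunks[cid] = payload
        self.dirty.add(cid)       # analogous to runtime update tracking

    def full_checkpoint(self):
        self.dirty.clear()
        return copy.deepcopy(self.chunks)

    def incremental_checkpoint(self):
        snap = {cid: copy.deepcopy(self.chunks[cid]) for cid in self.dirty}
        self.dirty.clear()
        return snap

    def restore(self, base, increments):
        """Rebuild the state from a full checkpoint plus later incremental ones,
        as needed when a causality error forces a rollback."""
        state = copy.deepcopy(base)
        for inc in increments:
            state.update(copy.deepcopy(inc))
        self.chunks = state
        self.dirty.clear()

# Usage: full checkpoint, an update, an incremental checkpoint, then rollback.
lp = ChunkedState()
lp.write("a", [0, 0]); lp.write("b", [1])
base = lp.full_checkpoint()
lp.write("a", [5, 5])
inc = lp.incremental_checkpoint()
lp.write("b", [9])                 # update after the last checkpoint is undone
lp.restore(base, [inc])
print(lp.chunks)                   # {'a': [5, 5], 'b': [1]}
```

    ASM itself performs the equivalent bookkeeping transparently at the level of the allocator-managed memory map, and it selects between the incremental and non-incremental modes at runtime according to its modeling/optimization approach.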