Missing data in randomised controlled trials: a practical guide
Objective: Missing data are ubiquitous in clinical trials, yet recent research suggests many statisticians
and investigators appear uncertain how to handle them. The objective is to set out a principled
approach for handling missing data in clinical trials, and provide examples and code to facilitate
its adoption.
Data sources: An asthma trial from GlaxoSmithKline, an asthma trial from AstraZeneca, and a
dental pain trial from GlaxoSmithKline.
Methods: Part I gives a non-technical review of how missing data are typically handled in clinical
trials, and the issues raised by missing data. We show that, when faced with missing data, no analysis
can avoid making additional untestable assumptions. This leads to a proposal for a systematic,
principled approach for handling missing data in clinical trials, which in turn informs a critique of
current Committee for Proprietary Medicinal Products guidelines for missing data, together with
many of the ad-hoc statistical methods currently employed.
Part II shows how primary analyses in a range of settings can be carried out under the so-called
missing at random assumption. This key assumption has a central role in underpinning the most
important classes of primary analysis, such as those based on likelihood. However, its validity cannot
be assessed from the data under analysis, so in Part III, two main approaches are developed and
illustrated for the assessment of the sensitivity of the primary analyses to this assumption.
Results: The literature review revealed that missing data are often ignored or poorly handled in the
analysis. Current guidelines and frequently used ad-hoc statistical methods are shown to be flawed.
A principled, yet practical, alternative approach is developed, which examples show leads to inferences
with greater validity. SAS code is given to facilitate its direct application.
Conclusions: From the design stage onwards, a principled approach to handling missing data should
be adopted. Such an approach follows well-defined and accepted statistical arguments, using models
and assumptions that are transparent, and hence open to criticism and debate. This monograph
outlines how this principled approach can be practically, and directly, applied to the majority of
trials with longitudinal follow-up.
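As an informal illustration of the kind of likelihood-based primary analysis that is valid under missing at random: the monograph itself supplies SAS code, and the Python/statsmodels analogue below is only a minimal sketch with assumed variable names (fev1, visit, arm, baseline_fev1, id); a full mixed model for repeated measures with an unstructured covariance would sit closer to the SAS approach.

import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format trial data: one row per patient-visit.
long_df = pd.read_csv("asthma_long.csv")

# A likelihood-based mixed model uses every observed visit, so under the
# missing-at-random assumption no explicit imputation is needed.
model = smf.mixedlm(
    "fev1 ~ C(visit) * C(arm) + baseline_fev1",   # visit-by-treatment means, baseline adjusted
    data=long_df.dropna(subset=["fev1"]),         # drop only rows whose outcome is missing
    groups="id",                                  # random intercept for each patient
)
print(model.fit(reml=True).summary())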
Multiple imputation methods for bivariate outcomes in cluster randomised trials.
Missing observations are common in cluster randomised trials. The problem is exacerbated when modelling bivariate outcomes jointly, as the proportion of complete cases is often considerably smaller than the proportion having either of the outcomes fully observed. Approaches taken to handling such missing data include the following: complete case analysis, single-level multiple imputation that ignores the clustering, multiple imputation with a fixed effect for each cluster and multilevel multiple imputation. We contrasted the alternative approaches to handling missing data in a cost-effectiveness analysis that uses data from a cluster randomised trial to evaluate an exercise intervention for care home residents. We then conducted a simulation study to assess the performance of these approaches on bivariate continuous outcomes, in terms of confidence interval coverage and empirical bias in the estimated treatment effects. Missing-at-random clustered data scenarios were simulated following a full-factorial design. Across all the missing data mechanisms considered, the multiple imputation methods provided estimators with negligible bias, while complete case analysis resulted in biased treatment effect estimates in scenarios where the randomised treatment arm was associated with missingness. Confidence interval coverage was generally in excess of nominal levels (up to 99.8%) following fixed-effects multiple imputation and too low following single-level multiple imputation. Multilevel multiple imputation led to coverage levels of approximately 95% throughout. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd
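A rough sketch of the "fixed effect for each cluster" imputation strategy contrasted above, written with Python's scikit-learn rather than any software used in the study; the column names (cost, qaly, arm coded 0/1, cluster) are assumptions, multilevel multiple imputation with random cluster effects would in practice use dedicated software such as the R package jomo, and Rubin's rules for the variance are omitted to keep the sketch short.

import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (activates IterativeImputer)
from sklearn.impute import IterativeImputer

df = pd.read_csv("care_home_trial.csv")   # hypothetical individual-level data
# Cluster fixed effects enter the imputation model as dummy variables.
X = pd.get_dummies(df[["cost", "qaly", "arm", "cluster"]], columns=["cluster"], drop_first=True)

effects = []
for k in range(20):                                        # 20 imputed data sets
    imputer = IterativeImputer(sample_posterior=True, random_state=k)
    completed = pd.DataFrame(imputer.fit_transform(X), columns=X.columns)
    # Analysis model on each completed data set: unadjusted difference in mean cost by arm.
    effects.append(completed.groupby("arm")["cost"].mean().diff().iloc[-1])

print("pooled treatment effect on cost:", np.mean(effects))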
A penalized framework for distributed lag non-linear models.
Distributed lag non-linear models (DLNMs) are a modelling tool for describing potentially non-linear and delayed dependencies. Here, we illustrate an extension of the DLNM framework through the use of penalized splines within generalized additive models (GAM). This extension offers built-in model selection procedures and the possibility of accommodating assumptions on the shape of the lag structure through specific penalties. In addition, this framework includes, as special cases, simpler models previously proposed for linear relationships (DLMs). Alternative versions of penalized DLNMs are compared with each other and with the standard unpenalized version in a simulation study. Results show that this penalized extension to the DLNM class provides greater flexibility and improved inferential properties. The framework exploits recent theoretical developments of GAMs and is implemented using efficient routines within freely available software. Real-data applications are illustrated through two reproducible examples in time series and survival analysis.
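For readers unfamiliar with distributed lag models, a deliberately crude Python sketch of the simplest special case mentioned above follows: a linear DLM with a ridge penalty on the lag coefficients, standing in for the spline-based smoothness penalties of the full framework. The variable names (deaths, temp) and the 21-day maximum lag are assumptions for illustration.

import pandas as pd
from sklearn.linear_model import Ridge

df = pd.read_csv("city_daily.csv")    # hypothetical daily series of deaths and temperature
max_lag = 21
# Column "lag{l}" holds the exposure observed l days earlier.
lagged = pd.concat([df["temp"].shift(l).rename(f"lag{l}") for l in range(max_lag + 1)], axis=1)
data = pd.concat([df["deaths"], lagged], axis=1).dropna()

# The ridge penalty shrinks the estimated lag coefficients, a rough stand-in for the
# smoothness penalties placed on the lag dimension in the penalized DLNM framework.
fit = Ridge(alpha=10.0).fit(data[lagged.columns], data["deaths"])
print(pd.Series(fit.coef_, index=lagged.columns))   # estimated lag-response curve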
Multiple imputation for IPD meta-analysis: allowing for heterogeneity and studies with missing covariates.
Recently, multiple imputation has been proposed as a tool for individual patient data meta-analysis with sporadically missing observations, and it has been suggested that within-study imputation is usually preferable. However, such within-study imputation cannot handle variables that are completely missing within studies. Further, if some of the contributing studies are relatively small, it may be appropriate to share information across studies when imputing. In this paper, we develop and evaluate a joint modelling approach to multiple imputation of individual patient data in meta-analysis, with an across-study probability distribution for the study-specific covariance matrices. This retains the flexibility to allow for between-study heterogeneity when imputing while allowing (i) sharing information on the covariance matrix across studies when this is appropriate, and (ii) imputing variables that are wholly missing from studies. Simulation results show both equivalent performance to the within-study imputation approach where this is valid, and good results in more general, practically relevant, scenarios with studies of very different sizes, non-negligible between-study heterogeneity and wholly missing variables. We illustrate our approach using data from an individual patient data meta-analysis of hypertension trials. © 2015 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd
Bayesian models for weighted data with missing values: a bootstrap approach
Many data sets, especially from surveys, are made available to users with weights. Where the derivation of such weights is known, this information can often be incorporated in the user's substantive model (model of interest). When the derivation is unknown, the established procedure is to carry out a weighted analysis. However, with non-trivial proportions of missing data this is inefficient and may be biased when data are not missing at random. Bayesian methods provide a natural framework for the imputation of missing data, but it is unclear how to handle the weights. We propose a weighted bootstrap Markov chain Monte Carlo algorithm for estimation and inference. A simulation study shows that it has good inferential properties. We illustrate its utility with an analysis of data from the Millennium Cohort Study.
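The general shape of such a weighted-bootstrap scheme can be conveyed with a toy Python example: simulated data, a single mean as the quantity of interest, and a one-line stand-in for the Bayesian step. This is only a sketch of the resample-then-update idea, not the authors' algorithm.

import numpy as np

rng = np.random.default_rng(1)
y = rng.normal(50, 10, size=500)               # outcome values
w = rng.uniform(0.5, 3.0, size=500)            # survey weights
y[rng.random(500) < 0.2] = np.nan              # 20% missing, for illustration

draws = []
for b in range(200):
    idx = rng.choice(500, size=500, replace=True, p=w / w.sum())  # weighted bootstrap resample
    yb = y[idx]
    obs = yb[~np.isnan(yb)]
    # Stand-in "Bayesian step": one posterior draw of the mean under a flat prior;
    # in the real algorithm this would be an MCMC update that also imputes missing values.
    draws.append(rng.normal(obs.mean(), obs.std(ddof=1) / np.sqrt(len(obs))))

print("posterior mean:", np.mean(draws), "95% interval:", np.percentile(draws, [2.5, 97.5]))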
A review of RCTs in four medical journals to assess the use of imputation to overcome missing data in quality of life outcomes
Reference based sensitivity analysis for longitudinal trials with protocol deviation via multiple imputation
Randomised controlled trials provide essential evidence for the evaluation of new and existing medical treatments. Unfortunately the statistical analysis is often complicated by the occurrence of protocol deviations, which mean we cannot always measure the intended outcomes for individuals who deviate, resulting in a missing data problem. In such settings, however one approaches the analysis, an untestable assumption about the distribution of the unobserved data must be made. To understand how far the results depend on these assumptions, the primary analysis should be supplemented by a range of sensitivity analyses, which explore how the conclusions vary over a range of different credible assumptions for the missing data. In this article we describe a new command, mimix, that can be used to perform reference based sensitivity analyses for randomised controlled trials with longitudinal quantitative outcome data, using the approach proposed by Carpenter, Roger, and Kenward (2013). Under this approach, we make qualitative assumptions about how individuals' missing outcomes relate to those observed in relevant groups in the trial, based on plausible clinical scenarios. Statistical analysis then proceeds using the method of multiple imputation
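To convey the flavour of a reference-based assumption such as "jump to reference", a deliberately over-simplified Python sketch follows: missing outcomes are drawn from the reference arm's observed distribution at the same visit. The actual Carpenter, Roger, and Kenward procedure also conditions on each patient's own observed history through a multivariate normal model, so the mimix command described above should be used for real analyses; the column names (id, arm, visit, y) and the arm label "reference" are assumptions.

import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
df = pd.read_csv("trial_long.csv")     # hypothetical long-format data: id, arm, visit, y

imputations = []
for m in range(10):                                       # 10 imputed data sets
    completed = df.copy()
    for visit, grp in df.groupby("visit"):
        ref = grp.loc[grp["arm"] == "reference", "y"].dropna()
        missing_idx = grp.index[grp["y"].isna().to_numpy()]
        # Draw missing values from the reference arm's observed distribution at this visit.
        completed.loc[missing_idx, "y"] = rng.normal(ref.mean(), ref.std(ddof=1), len(missing_idx))
    imputations.append(completed)

# Each completed data set would then be analysed and the results combined by Rubin's rules.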
A re-randomisation design for clinical trials
Background: Recruitment to clinical trials is often problematic, with many trials failing to recruit to their target sample size. As a result, patient care may be based on suboptimal evidence from underpowered trials or non-randomised studies. Methods: For many conditions patients will require treatment on several occasions, for example, to treat symptoms of an underlying chronic condition (such as migraines, where treatment is required each time a new episode occurs), or until they achieve treatment success (such as fertility, where patients undergo treatment on multiple occasions until they become pregnant). We describe a re-randomisation design for these scenarios, which allows each patient to be independently randomised on multiple occasions. We discuss the circumstances in which this design can be used. Results: The re-randomisation design will give asymptotically unbiased estimates of treatment effect and correct type I error rates under the following conditions: (a) patients are only re-randomised after the follow-up period from their previous randomisation is complete; (b) randomisations for the same patient are performed independently; and (c) the treatment effect is constant across all randomisations. Provided the analysis accounts for correlation between observations from the same patient, this design will typically have higher power than a parallel group trial with an equivalent number of observations. Conclusions: If used appropriately, the re-randomisation design can increase the recruitment rate for clinical trials while still providing an unbiased estimate of treatment effect and correct type I error rates. In many situations, it can increase the power compared to a parallel group design with an equivalent number of observations
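A hedged sketch of the kind of analysis alluded to above: each row is one randomisation episode, and the correlation between episodes from the same patient is handled with generalised estimating equations and robust standard errors. The column names (patient_id, treat, y) and the GEE choice itself are illustrative assumptions rather than the authors' prescription.

import pandas as pd
import statsmodels.api as sm

episodes = pd.read_csv("episodes.csv")    # hypothetical: one row per randomisation episode

gee = sm.GEE.from_formula(
    "y ~ treat",                           # constant treatment effect across randomisations
    groups="patient_id",                   # cluster on patient to handle repeated episodes
    data=episodes,
    cov_struct=sm.cov_struct.Exchangeable(),
    family=sm.families.Gaussian(),
)
print(gee.fit().summary())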
Information anchored reference‐based sensitivity analysis for truncated normal data with application to survival analysis
The primary analysis of time-to-event data typically makes the censoring at random assumption, that is, that—conditional on covariates in the model—the distribution of event times is the same, whether they are observed or unobserved. In such cases, we need to explore the robustness of inference to more pragmatic assumptions about patients post-censoring in sensitivity analyses. Reference-based multiple imputation, which avoids analysts explicitly specifying the parameters of the unobserved data distribution, has proved attractive to researchers. Building on results for longitudinal continuous data, we show that inference using a Tobit regression imputation model for reference-based sensitivity analysis with right censored log normal data is information anchored, meaning the proportion of information lost due to missing data under the primary analysis is held constant across the sensitivity analyses. We illustrate our theoretical results using simulation and a clinical trial case study
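The basic building block referred to above can be illustrated in a few lines of Python: maximum likelihood for a normal model of log event times under right censoring (a Tobit-type fit), shown on simulated data. In the reference-based procedure this model would be fitted to the relevant reference group and then used to impute censored patients; that imputation step is omitted here, so this is an illustration rather than the authors' implementation.

import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(3)
log_t = rng.normal(2.0, 0.7, size=300)            # true log event times
log_c = rng.normal(2.2, 0.5, size=300)            # log censoring times
y = np.minimum(log_t, log_c)                      # observed log time
event = (log_t <= log_c).astype(float)            # 1 = event observed, 0 = right censored

def neg_loglik(par):
    mu, log_sigma = par
    sigma = np.exp(log_sigma)
    ll_event = norm.logpdf(y, mu, sigma)          # density contribution for observed events
    ll_cens = norm.logsf(y, mu, sigma)            # survival contribution for censored times
    return -np.sum(event * ll_event + (1 - event) * ll_cens)

fit = minimize(neg_loglik, x0=[np.mean(y), 0.0], method="Nelder-Mead")
print("mu:", fit.x[0], "sigma:", np.exp(fit.x[1]))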
