Missing data in randomised controlled trials: a practical guide
Objective: Missing data are ubiquitous in clinical trials, yet recent research suggests many statisticians
and investigators appear uncertain how to handle them. The objective is to set out a principled
approach for handling missing data in clinical trials, and provide examples and code to facilitate
its adoption.
Data sources: An asthma trial from GlaxoSmithKline, an asthma trial from AstraZeneca, and a
dental pain trial from GlaxoSmithKline.
Methods: Part I gives a non-technical review of how missing data are typically handled in clinical
trials, and the issues raised by missing data. We show that, when faced with missing data, no analysis
can avoid making additional untestable assumptions. This leads to a proposal for a systematic,
principled approach for handling missing data in clinical trials, which in turn informs a critique of
current Committee for Proprietary Medicinal Products guidelines for missing data, together with
many of the ad-hoc statistical methods currently employed.
Part II shows how primary analyses in a range of settings can be carried out under the so-called
missing at random assumption. This key assumption has a central role in underpinning the most
important classes of primary analysis, such as those based on likelihood. However, its validity cannot
be assessed from the data under analysis, so in Part III, two main approaches are developed and
illustrated for the assessment of the sensitivity of the primary analyses to this assumption.
Results: The literature review revealed that missing data are often ignored or poorly handled in the
analysis. Current guidelines and frequently used ad-hoc statistical methods are shown to be flawed.
A principled, yet practical, alternative approach is developed, which examples show leads to inferences
with greater validity. SAS code is given to facilitate its direct application.
Conclusions: From the design stage onwards, a principled approach to handling missing data should
be adopted. Such an approach follows well-defined and accepted statistical arguments, using models
and assumptions that are transparent, and hence open to criticism and debate. This monograph
outlines how this principled approach can be practically, and directly, applied to the majority of
trials with longitudinal follow-up.
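As an informal illustration of the kind of likelihood-based primary analysis that is valid under missing at random: the monograph itself supplies SAS code, and the Python/statsmodels analogue below is only a minimal sketch with assumed variable names (fev1, visit, arm, baseline_fev1, id); a full mixed model for repeated measures with an unstructured covariance would sit closer to the SAS approach.

import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format trial data: one row per patient-visit.
long_df = pd.read_csv("asthma_long.csv")

# A likelihood-based mixed model uses every observed visit, so under the
# missing-at-random assumption no explicit imputation is needed.
model = smf.mixedlm(
    "fev1 ~ C(visit) * C(arm) + baseline_fev1",   # visit-by-treatment means, baseline adjusted
    data=long_df.dropna(subset=["fev1"]),         # drop only rows whose outcome is missing
    groups="id",                                  # random intercept for each patient
)
print(model.fit(reml=True).summary())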
Multiple imputation methods for bivariate outcomes in cluster randomised trials.
Missing observations are common in cluster randomised trials. The problem is exacerbated when modelling bivariate outcomes jointly, as the proportion of complete cases is often considerably smaller than the proportion having either of the outcomes fully observed. Approaches taken to handling such missing data include the following: complete case analysis, single-level multiple imputation that ignores the clustering, multiple imputation with a fixed effect for each cluster and multilevel multiple imputation. We contrasted the alternative approaches to handling missing data in a cost-effectiveness analysis that uses data from a cluster randomised trial to evaluate an exercise intervention for care home residents. We then conducted a simulation study to assess the performance of these approaches on bivariate continuous outcomes, in terms of confidence interval coverage and empirical bias in the estimated treatment effects. Missing-at-random clustered data scenarios were simulated following a full-factorial design. Across all the missing data mechanisms considered, the multiple imputation methods provided estimators with negligible bias, while complete case analysis resulted in biased treatment effect estimates in scenarios where the randomised treatment arm was associated with missingness. Confidence interval coverage was generally in excess of nominal levels (up to 99.8%) following fixed-effects multiple imputation and too low following single-level multiple imputation. Multilevel multiple imputation led to coverage levels of approximately 95% throughout. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd
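A rough sketch of the "fixed effect for each cluster" imputation strategy contrasted above, written with Python's scikit-learn rather than any software used in the study; the column names (cost, qaly, arm coded 0/1, cluster) are assumptions, multilevel multiple imputation with random cluster effects would in practice use dedicated software such as the R package jomo, and Rubin's rules for the variance are omitted to keep the sketch short.

import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (activates IterativeImputer)
from sklearn.impute import IterativeImputer

df = pd.read_csv("care_home_trial.csv")   # hypothetical individual-level data
# Cluster fixed effects enter the imputation model as dummy variables.
X = pd.get_dummies(df[["cost", "qaly", "arm", "cluster"]], columns=["cluster"], drop_first=True)

effects = []
for k in range(20):                                        # 20 imputed data sets
    imputer = IterativeImputer(sample_posterior=True, random_state=k)
    completed = pd.DataFrame(imputer.fit_transform(X), columns=X.columns)
    # Analysis model on each completed data set: unadjusted difference in mean cost by arm.
    effects.append(completed.groupby("arm")["cost"].mean().diff().iloc[-1])

print("pooled treatment effect on cost:", np.mean(effects))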
A penalized framework for distributed lag non-linear models.
Distributed lag non-linear models (DLNMs) are a modelling tool for describing potentially non-linear and delayed dependencies. Here, we illustrate an extension of the DLNM framework through the use of penalized splines within generalized additive models (GAM). This extension offers built-in model selection procedures and the possibility of accommodating assumptions on the shape of the lag structure through specific penalties. In addition, this framework includes, as special cases, simpler models previously proposed for linear relationships (DLMs). Alternative versions of penalized DLNMs are compared with each other and with the standard unpenalized version in a simulation study. Results show that this penalized extension to the DLNM class provides greater flexibility and improved inferential properties. The framework exploits recent theoretical developments of GAMs and is implemented using efficient routines within freely available software. Real-data applications are illustrated through two reproducible examples in time series and survival analysis.
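For readers unfamiliar with distributed lag models, a deliberately crude Python sketch of the simplest special case mentioned above follows: a linear DLM with a ridge penalty on the lag coefficients, standing in for the spline-based smoothness penalties of the full framework. The variable names (deaths, temp) and the 21-day maximum lag are assumptions for illustration.

import pandas as pd
from sklearn.linear_model import Ridge

df = pd.read_csv("city_daily.csv")    # hypothetical daily series of deaths and temperature
max_lag = 21
# Column "lag{l}" holds the exposure observed l days earlier.
lagged = pd.concat([df["temp"].shift(l).rename(f"lag{l}") for l in range(max_lag + 1)], axis=1)
data = pd.concat([df["deaths"], lagged], axis=1).dropna()

# The ridge penalty shrinks the estimated lag coefficients, a rough stand-in for the
# smoothness penalties placed on the lag dimension in the penalized DLNM framework.
fit = Ridge(alpha=10.0).fit(data[lagged.columns], data["deaths"])
print(pd.Series(fit.coef_, index=lagged.columns))   # estimated lag-response curve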
Multiple imputation for IPD meta-analysis: allowing for heterogeneity and studies with missing covariates.
Recently, multiple imputation has been proposed as a tool for individual patient data meta-analysis with sporadically missing observations, and it has been suggested that within-study imputation is usually preferable. However, such within-study imputation cannot handle variables that are completely missing within studies. Further, if some of the contributing studies are relatively small, it may be appropriate to share information across studies when imputing. In this paper, we develop and evaluate a joint modelling approach to multiple imputation of individual patient data in meta-analysis, with an across-study probability distribution for the study-specific covariance matrices. This retains the flexibility to allow for between-study heterogeneity when imputing while allowing (i) sharing information on the covariance matrix across studies when this is appropriate, and (ii) imputing variables that are wholly missing from studies. Simulation results show both equivalent performance to the within-study imputation approach where this is valid, and good results in more general, practically relevant, scenarios with studies of very different sizes, non-negligible between-study heterogeneity and wholly missing variables. We illustrate our approach using data from an individual patient data meta-analysis of hypertension trials. © 2015 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd
Bayesian models for weighted data with missing values: a bootstrap approach
Many data sets, especially from surveys, are made available to users with weights. Where the derivation of such weights is known, this information can often be incorporated in the user's substantive model (model of interest). When the derivation is unknown, the established procedure is to carry out a weighted analysis. However, with non-trivial proportions of missing data this is inefficient and may be biased when data are not missing at random. Bayesian methods provide a natural framework for the imputation of missing data, but it is unclear how to handle the weights. We propose a weighted bootstrap Markov chain Monte Carlo algorithm for estimation and inference. A simulation study shows that it has good inferential properties. We illustrate its utility with an analysis of data from the Millennium Cohort Study.
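The general shape of such a weighted-bootstrap scheme can be conveyed with a toy Python example: simulated data, a single mean as the quantity of interest, and a one-line stand-in for the Bayesian step. This is only a sketch of the resample-then-update idea, not the authors' algorithm.

import numpy as np

rng = np.random.default_rng(1)
y = rng.normal(50, 10, size=500)               # outcome values
w = rng.uniform(0.5, 3.0, size=500)            # survey weights
y[rng.random(500) < 0.2] = np.nan              # 20% missing, for illustration

draws = []
for b in range(200):
    idx = rng.choice(500, size=500, replace=True, p=w / w.sum())  # weighted bootstrap resample
    yb = y[idx]
    obs = yb[~np.isnan(yb)]
    # Stand-in "Bayesian step": one posterior draw of the mean under a flat prior;
    # in the real algorithm this would be an MCMC update that also imputes missing values.
    draws.append(rng.normal(obs.mean(), obs.std(ddof=1) / np.sqrt(len(obs))))

print("posterior mean:", np.mean(draws), "95% interval:", np.percentile(draws, [2.5, 97.5]))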
A review of RCTs in four medical journals to assess the use of imputation to overcome missing data in quality of life outcomes
Reference based sensitivity analysis for longitudinal trials with protocol deviation via multiple imputation
Randomised controlled trials provide essential evidence for the evaluation of new and existing medical treatments. Unfortunately the statistical analysis is often complicated by the occurrence of protocol deviations, which mean we cannot always measure the intended outcomes for individuals who deviate, resulting in a missing data problem. In such settings, however one approaches the analysis, an untestable assumption about the distribution of the unobserved data must be made. To understand how far the results depend on these assumptions, the primary analysis should be supplemented by a range of sensitivity analyses, which explore how the conclusions vary over a range of different credible assumptions for the missing data. In this article we describe a new command, mimix, that can be used to perform reference based sensitivity analyses for randomised controlled trials with longitudinal quantitative outcome data, using the approach proposed by Carpenter, Roger, and Kenward (2013). Under this approach, we make qualitative assumptions about how individuals' missing outcomes relate to those observed in relevant groups in the trial, based on plausible clinical scenarios. Statistical analysis then proceeds using the method of multiple imputation
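To convey the flavour of a reference-based assumption such as "jump to reference", a deliberately over-simplified Python sketch follows: missing outcomes are drawn from the reference arm's observed distribution at the same visit. The actual Carpenter, Roger, and Kenward procedure also conditions on each patient's own observed history through a multivariate normal model, so the mimix command described above should be used for real analyses; the column names (id, arm, visit, y) and the arm label "reference" are assumptions.

import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
df = pd.read_csv("trial_long.csv")     # hypothetical long-format data: id, arm, visit, y

imputations = []
for m in range(10):                                       # 10 imputed data sets
    completed = df.copy()
    for visit, grp in df.groupby("visit"):
        ref = grp.loc[grp["arm"] == "reference", "y"].dropna()
        missing_idx = grp.index[grp["y"].isna().to_numpy()]
        # Draw missing values from the reference arm's observed distribution at this visit.
        completed.loc[missing_idx, "y"] = rng.normal(ref.mean(), ref.std(ddof=1), len(missing_idx))
    imputations.append(completed)

# Each completed data set would then be analysed and the results combined by Rubin's rules.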
A re-randomisation design for clinical trials
Background: Recruitment to clinical trials is often problematic, with many trials failing to recruit to their target sample size. As a result, patient care may be based on suboptimal evidence from underpowered trials or non-randomised studies. Methods: For many conditions patients will require treatment on several occasions, for example, to treat symptoms of an underlying chronic condition (such as migraines, where treatment is required each time a new episode occurs), or until they achieve treatment success (such as fertility, where patients undergo treatment on multiple occasions until they become pregnant). We describe a re-randomisation design for these scenarios, which allows each patient to be independently randomised on multiple occasions. We discuss the circumstances in which this design can be used. Results: The re-randomisation design will give asymptotically unbiased estimates of treatment effect and correct type I error rates under the following conditions: (a) patients are only re-randomised after the follow-up period from their previous randomisation is complete; (b) randomisations for the same patient are performed independently; and (c) the treatment effect is constant across all randomisations. Provided the analysis accounts for correlation between observations from the same patient, this design will typically have higher power than a parallel group trial with an equivalent number of observations. Conclusions: If used appropriately, the re-randomisation design can increase the recruitment rate for clinical trials while still providing an unbiased estimate of treatment effect and correct type I error rates. In many situations, it can increase the power compared to a parallel group design with an equivalent number of observations
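A hedged sketch of the kind of analysis alluded to above: each row is one randomisation episode, and the correlation between episodes from the same patient is handled with generalised estimating equations and robust standard errors. The column names (patient_id, treat, y) and the GEE choice itself are illustrative assumptions rather than the authors' prescription.

import pandas as pd
import statsmodels.api as sm

episodes = pd.read_csv("episodes.csv")    # hypothetical: one row per randomisation episode

gee = sm.GEE.from_formula(
    "y ~ treat",                           # constant treatment effect across randomisations
    groups="patient_id",                   # cluster on patient to handle repeated episodes
    data=episodes,
    cov_struct=sm.cov_struct.Exchangeable(),
    family=sm.families.Gaussian(),
)
print(gee.fit().summary())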
Information anchored reference‐based sensitivity analysis for truncated normal data with application to survival analysis
The primary analysis of time-to-event data typically makes the censoring at random assumption, that is, that—conditional on covariates in the model—the distribution of event times is the same, whether they are observed or unobserved. In such cases, we need to explore the robustness of inference to more pragmatic assumptions about patients post-censoring in sensitivity analyses. Reference-based multiple imputation, which avoids analysts explicitly specifying the parameters of the unobserved data distribution, has proved attractive to researchers. Building on results for longitudinal continuous data, we show that inference using a Tobit regression imputation model for reference-based sensitivity analysis with right censored log normal data is information anchored, meaning the proportion of information lost due to missing data under the primary analysis is held constant across the sensitivity analyses. We illustrate our theoretical results using simulation and a clinical trial case study
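The basic building block referred to above can be illustrated in a few lines of Python: maximum likelihood for a normal model of log event times under right censoring (a Tobit-type fit), shown on simulated data. In the reference-based procedure this model would be fitted to the relevant reference group and then used to impute censored patients; that imputation step is omitted here, so this is an illustration rather than the authors' implementation.

import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(3)
log_t = rng.normal(2.0, 0.7, size=300)            # true log event times
log_c = rng.normal(2.2, 0.5, size=300)            # log censoring times
y = np.minimum(log_t, log_c)                      # observed log time
event = (log_t <= log_c).astype(float)            # 1 = event observed, 0 = right censored

def neg_loglik(par):
    mu, log_sigma = par
    sigma = np.exp(log_sigma)
    ll_event = norm.logpdf(y, mu, sigma)          # density contribution for observed events
    ll_cens = norm.logsf(y, mu, sigma)            # survival contribution for censored times
    return -np.sum(event * ll_event + (1 - event) * ll_cens)

fit = minimize(neg_loglik, x0=[np.mean(y), 0.0], method="Nelder-Mead")
print("mu:", fit.x[0], "sigma:", np.exp(fit.x[1]))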
