
    Missing data in randomised controlled trials: a practical guide

    Objective: Missing data are ubiquitous in clinical trials, yet recent research suggests many statisticians and investigators appear uncertain how to handle them. The objective is to set out a principled approach for handling missing data in clinical trials, and to provide examples and code to facilitate its adoption. Data sources: An asthma trial from GlaxoSmithKline, an asthma trial from AstraZeneca, and a dental pain trial from GlaxoSmithKline. Methods: Part I gives a non-technical review of how missing data are typically handled in clinical trials, and of the issues they raise. We show that, when faced with missing data, no analysis can avoid making additional untestable assumptions. This leads to a proposal for a systematic, principled approach for handling missing data in clinical trials, which in turn informs a critique of current Committee for Proprietary Medicinal Products guidelines for missing data, together with many of the ad-hoc statistical methods currently employed. Part II shows how primary analyses in a range of settings can be carried out under the so-called missing at random assumption. This key assumption underpins the most important classes of primary analysis, such as those based on likelihood. However, its validity cannot be assessed from the data under analysis, so in Part III two main approaches are developed and illustrated for assessing the sensitivity of the primary analyses to this assumption. Results: The literature review revealed that missing data are often ignored, or poorly handled, in the analysis. Current guidelines, and frequently used ad-hoc statistical methods, are shown to be flawed. A principled, yet practical, alternative approach is developed, which examples show leads to inferences with greater validity. SAS code is given to facilitate its direct application. Conclusions: From the design stage onwards, a principled approach to handling missing data should be adopted. Such an approach follows well-defined and accepted statistical arguments, using models and assumptions that are transparent, and hence open to criticism and debate. This monograph outlines how this principled approach can be practically, and directly, applied to the majority of trials with longitudinal follow-up.
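The monograph supplies SAS code; purely as an illustration, the pooling step that any multiple-imputation analysis ends with (Rubin's rules) can be sketched in a few lines of Python. The function name `rubins_rules` and the toy numbers below are not from the source.

```python
import numpy as np

def rubins_rules(estimates, variances):
    """Combine per-imputation results using Rubin's rules.

    estimates, variances: length-M sequences holding the point estimate
    and its squared standard error from each of M imputed data sets.
    Returns the pooled estimate and its total variance.
    """
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = len(estimates)
    q_bar = estimates.mean()            # pooled point estimate
    w_bar = variances.mean()            # average within-imputation variance
    b = estimates.var(ddof=1)           # between-imputation variance
    t = w_bar + (1 + 1 / m) * b         # total variance
    return q_bar, t

# Toy example: three imputed data sets
pooled, total_var = rubins_rules([1.2, 1.0, 1.4], [0.04, 0.05, 0.045])
```

The total variance inflates the average within-imputation variance by the between-imputation spread, which is how the uncertainty due to the missing data is propagated into the final inference.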


    Multiple imputation methods for bivariate outcomes in cluster randomised trials.

    Missing observations are common in cluster randomised trials. The problem is exacerbated when modelling bivariate outcomes jointly, as the proportion of complete cases is often considerably smaller than the proportion having either of the outcomes fully observed. Approaches taken to handling such missing data include the following: complete case analysis, single-level multiple imputation that ignores the clustering, multiple imputation with a fixed effect for each cluster and multilevel multiple imputation. We contrasted the alternative approaches to handling missing data in a cost-effectiveness analysis that uses data from a cluster randomised trial to evaluate an exercise intervention for care home residents. We then conducted a simulation study to assess the performance of these approaches on bivariate continuous outcomes, in terms of confidence interval coverage and empirical bias in the estimated treatment effects. Missing-at-random clustered data scenarios were simulated following a full-factorial design. Across all the missing data mechanisms considered, the multiple imputation methods provided estimators with negligible bias, while complete case analysis resulted in biased treatment effect estimates in scenarios where the randomised treatment arm was associated with missingness. Confidence interval coverage was generally in excess of nominal levels (up to 99.8%) following fixed-effects multiple imputation and too low following single-level multiple imputation. Multilevel multiple imputation led to coverage levels of approximately 95% throughout. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd
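The point that complete cases can be much scarcer than cases with either outcome observed is easy to see numerically. A minimal sketch (the 20% missingness rates and the independence assumption are illustrative, not taken from the trial):

```python
import numpy as np

rng = np.random.default_rng(2016)
n = 1000
# Simulate independent 20% missingness in each of two outcomes
y1_missing = rng.random(n) < 0.20
y2_missing = rng.random(n) < 0.20

either_observed = ~y1_missing | ~y2_missing  # at least one outcome seen
complete_case = ~y1_missing & ~y2_missing    # both outcomes seen

print(either_observed.mean(), complete_case.mean())
```

With 20% missingness per outcome, roughly 96% of individuals have at least one outcome observed but only about 64% are complete cases, so a joint complete-case analysis discards far more data than either univariate analysis would.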

    A penalized framework for distributed lag non-linear models.

    Distributed lag non-linear models (DLNMs) are a modelling tool for describing potentially non-linear and delayed dependencies. Here, we illustrate an extension of the DLNM framework through the use of penalized splines within generalized additive models (GAMs). This extension offers built-in model selection procedures and the possibility of accommodating assumptions on the shape of the lag structure through specific penalties. In addition, this framework includes, as special cases, simpler models previously proposed for linear relationships (DLMs). Alternative versions of penalized DLNMs are compared with each other and with the standard unpenalized version in a simulation study. Results show that this penalized extension to the DLNM class provides greater flexibility and improved inferential properties. The framework exploits recent theoretical developments of GAMs and is implemented using efficient routines within freely available software. Real-data applications are illustrated through two reproducible examples in time series and survival analysis.
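As context, the unpenalized distributed lag model (DLM) that the penalized DLNM framework generalizes simply regresses the outcome on lagged copies of the exposure. A minimal Python sketch; the lag weights and sample size are invented for illustration, and the paper's penalized-spline machinery is not shown:

```python
import numpy as np

def lag_matrix(x, max_lag):
    """Stack x and its lags 0..max_lag column-wise, trimming the burn-in
    so that every row has a full set of lags."""
    n = len(x)
    cols = [x[max_lag - l : n - l] for l in range(max_lag + 1)]
    return np.column_stack(cols)

rng = np.random.default_rng(1)
x = rng.normal(size=500)
# Outcome depends on lags 0..2 of x with decaying (invented) weights
true_w = np.array([1.0, 0.5, 0.25])
y = np.convolve(x, true_w)[: len(x)] + rng.normal(scale=0.1, size=len(x))

# Ordinary least squares recovers the lag-specific coefficients
X = lag_matrix(x, max_lag=2)
design = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(design, y[2:], rcond=None)
```

With many lags, these unconstrained coefficients become unstable, which is exactly what the penalized splines of the DLNM extension address: the lag curve is represented with a basis and a roughness penalty instead of one free coefficient per lag.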

    Bayesian models for weighted data with missing values: a bootstrap approach

    Many data sets, especially from surveys, are made available to users with weights. Where the derivation of such weights is known, this information can often be incorporated in the user's substantive model (model of interest). When the derivation is unknown, the established procedure is to carry out a weighted analysis; however, with non-trivial proportions of missing data this is inefficient and may be biased when data are not missing at random. Bayesian methods provide a natural framework for the imputation of missing data, but it is unclear how the weights should be handled. We propose a weighted bootstrap Markov chain Monte Carlo algorithm for estimation and inference. A simulation study shows that it has good inferential properties. We illustrate its utility with an analysis of data from the Millennium Cohort Study.
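To convey the flavour of a weighted bootstrap, stripped of the MCMC and imputation layers of the proposed algorithm, one can resample units with probability proportional to their weights and summarize the resampled statistics. The function `weighted_bootstrap_mean` and the toy data are illustrative assumptions, not the paper's procedure:

```python
import numpy as np

def weighted_bootstrap_mean(y, w, n_boot=2000, seed=0):
    """Bootstrap the weighted mean: draw each resample with replacement,
    selecting units with probability proportional to their survey weights,
    then average each resample."""
    rng = np.random.default_rng(seed)
    p = np.asarray(w, dtype=float) / np.sum(w)
    idx = rng.choice(len(y), size=(n_boot, len(y)), p=p)
    boots = np.asarray(y)[idx].mean(axis=1)
    return boots.mean(), np.percentile(boots, [2.5, 97.5])

y = np.array([1.0, 2.0, 3.0, 4.0])
w = np.array([4.0, 3.0, 2.0, 1.0])   # weighted mean of y is 2.0
est, ci = weighted_bootstrap_mean(y, w)
```

In the paper's setting, each weighted resample would additionally feed a Bayesian model that imputes the missing values, so the bootstrap propagates both the weighting and the imputation uncertainty.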

    Multiple imputation for IPD meta-analysis: allowing for heterogeneity and studies with missing covariates.

    Recently, multiple imputation has been proposed as a tool for individual patient data meta-analysis with sporadically missing observations, and it has been suggested that within-study imputation is usually preferable. However, within-study imputation cannot handle variables that are completely missing within studies. Further, if some of the contributing studies are relatively small, it may be appropriate to share information across studies when imputing. In this paper, we develop and evaluate a joint modelling approach to multiple imputation of individual patient data in meta-analysis, with an across-study probability distribution for the study-specific covariance matrices. This retains the flexibility to allow for between-study heterogeneity when imputing, while (i) sharing information on the covariance matrix across studies when this is appropriate, and (ii) imputing variables that are wholly missing from studies. Simulation results show both equivalent performance to the within-study imputation approach where this is valid, and good results in more general, practically relevant scenarios with studies of very different sizes, non-negligible between-study heterogeneity and wholly missing variables. We illustrate our approach using data from an individual patient data meta-analysis of hypertension trials. © 2015 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
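The idea of borrowing strength across studies can be caricatured as shrinking a small study's covariance estimate toward the pooled one. This is only a heuristic sketch, not the paper's across-study probability model; the function name and the prior weight `n0` are invented:

```python
import numpy as np

def shrunk_covariance(study_cov, n_study, pooled_cov, n0=50.0):
    """Illustrative shrinkage: weight the study-specific covariance against
    the across-study pooled covariance. Small studies (n_study << n0) are
    pulled harder toward the pool; n0 is a hypothetical prior 'sample size'."""
    w = n_study / (n_study + n0)
    return w * study_cov + (1 - w) * pooled_cov

pooled = np.array([[1.0, 0.3], [0.3, 1.0]])
small_study = np.array([[2.0, 0.9], [0.9, 2.0]])   # noisy estimate, n = 10
cov = shrunk_covariance(small_study, 10, pooled)
```

A fully Bayesian version replaces this fixed weighting with a distribution over the study-specific covariance matrices, so the degree of pooling is itself estimated from the data.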

    Reference based sensitivity analysis for longitudinal trials with protocol deviation via multiple imputation

    Randomised controlled trials provide essential evidence for the evaluation of new and existing medical treatments. Unfortunately, the statistical analysis is often complicated by the occurrence of protocol deviations, which mean we cannot always measure the intended outcomes for individuals who deviate, resulting in a missing data problem. In such settings, however one approaches the analysis, an untestable assumption about the distribution of the unobserved data must be made. To understand how far the results depend on these assumptions, the primary analysis should be supplemented by a range of sensitivity analyses, which explore how the conclusions vary over a range of credible assumptions about the missing data. In this article we describe a new command, mimix, that can be used to perform reference-based sensitivity analyses for randomised controlled trials with longitudinal quantitative outcome data, using the approach proposed by Carpenter, Roger, and Kenward (2013). Under this approach, we make qualitative assumptions about how individuals' missing outcomes relate to those observed in relevant groups in the trial, based on plausible clinical scenarios. Statistical analysis then proceeds using the method of multiple imputation.
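As a toy illustration of the "jump to reference" idea behind such analyses (at a single timepoint, ignoring covariates and the proper multiple-imputation machinery that mimix implements), missing active-arm outcomes can be drawn from a distribution fitted to the reference arm. All names and numbers below are invented:

```python
import numpy as np

def jump_to_reference(y_active_obs, y_ref_obs, n_missing, rng):
    """Hypothetical single-timepoint sketch: impute unobserved active-arm
    outcomes by drawing from a normal fitted to the observed reference arm,
    then return the completed active-arm outcome vector."""
    mu, sd = np.mean(y_ref_obs), np.std(y_ref_obs, ddof=1)
    draws = rng.normal(mu, sd, size=n_missing)
    return np.concatenate([y_active_obs, draws])

rng = np.random.default_rng(7)
y_ref = rng.normal(0.0, 1.0, size=200)   # reference arm, fully observed
y_act = rng.normal(1.0, 1.0, size=150)   # active arm: 50 deviators missing
completed = jump_to_reference(y_act, y_ref, n_missing=50, rng=rng)
```

Because deviators are assumed to behave like the reference group, the imputed values pull the active-arm mean toward the reference arm, which is exactly the conservative clinical scenario this class of sensitivity analysis encodes.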

    Information anchored reference‐based sensitivity analysis for truncated normal data with application to survival analysis

    The primary analysis of time-to-event data typically makes the censoring at random assumption, that is, that (conditional on the covariates in the model) the distribution of event times is the same whether they are observed or unobserved. In such cases, we need to explore the robustness of inference to more pragmatic assumptions about patients' post-censoring outcomes in sensitivity analyses. Reference-based multiple imputation, which avoids analysts explicitly specifying the parameters of the unobserved data distribution, has proved attractive to researchers. Building on results for longitudinal continuous data, we show that inference using a Tobit regression imputation model for reference-based sensitivity analysis with right-censored log-normal data is information anchored, meaning that the proportion of information lost due to missing data under the primary analysis is held constant across the sensitivity analyses. We illustrate our theoretical results using simulation and a clinical trial case study.
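The imputation step behind such a sensitivity analysis draws log event times from a normal distribution truncated below at the censoring point. A self-contained sketch using rejection sampling; the working-model parameters are invented, and the paper's Tobit regression fitting is not shown:

```python
import numpy as np

def draw_beyond_censoring(log_c, mu, sigma, rng, size):
    """Draw log event times from N(mu, sigma^2) truncated to (log_c, inf)
    by simple rejection sampling (adequate when censoring is not extreme)."""
    out = np.empty(size)
    filled = 0
    while filled < size:
        cand = rng.normal(mu, sigma, size=4 * size)
        cand = cand[cand > log_c]          # keep only draws past censoring
        take = min(len(cand), size - filled)
        out[filled : filled + take] = cand[:take]
        filled += take
    return out

rng = np.random.default_rng(42)
# Impute log survival times for subjects censored at t = 1 (log t = 0),
# assuming log times ~ N(0.5, 1) under an invented working model
draws = draw_beyond_censoring(0.0, 0.5, 1.0, rng, size=1000)
```

Every imputed log time exceeds the censoring point by construction, and the truncation pushes the imputed mean above the working-model mean, mirroring the fact that a subject censored at time t is known to survive beyond t.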

    A review of RCTs in four medical journals to assess the use of imputation to overcome missing data in quality of life outcomes

    Background: Randomised controlled trials (RCTs) are perceived as the gold-standard method for evaluating healthcare interventions, and increasingly include quality of life (QoL) measures. The observed results are susceptible to bias if a substantial proportion of outcome data are missing. The review aimed to determine whether imputation was used to deal with missing QoL outcomes. Methods: A random selection of 285 RCTs published during 2005/6 in the British Medical Journal, Lancet, New England Journal of Medicine and Journal of the American Medical Association was identified. Results: QoL outcomes were reported in 61 (21%) trials. Six (10%) reported having no missing data, 20 (33%) reported ≤10% missing, 11 (18%) reported 11%–20% missing, and 11 (18%) reported >20% missing. Missingness was unclear in 13 (21%). Missing data were imputed in 19 (31%) of the 61 trials. Imputation was part of the primary analysis in 13 trials, but a sensitivity analysis in six. Last value carried forward was used in 12 trials and multiple imputation in two. Following imputation, the most common analysis method was analysis of covariance (10 trials). Conclusion: The majority of studies did not impute missing data and carried out a complete-case analysis. For those studies that did impute missing data, researchers tended to prefer simpler methods of imputation, despite more sophisticated methods being available. The Health Services Research Unit is funded by the Chief Scientist Office of the Scottish Government Health Directorate. Shona Fielding is also currently funded by the Chief Scientist Office on a Research Training Fellowship (CZF/1/31).
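Last observation carried forward, the imputation method most often seen in the review, is trivially simple to implement, which partly explains its popularity despite its known flaws. A minimal sketch with invented QoL scores:

```python
import numpy as np

def locf(series):
    """Last observation carried forward: replace each NaN with the most
    recent observed value. Leading NaNs with nothing to carry stay NaN."""
    out = np.array(series, dtype=float)
    for i in range(1, len(out)):
        if np.isnan(out[i]):
            out[i] = out[i - 1]
    return out

qol = [np.nan, 62.0, np.nan, np.nan, 70.0, np.nan]
completed_qol = locf(qol)   # carried forward: nan, 62, 62, 62, 70, 70
```

The carried-forward values are treated as if they were observed, so LOCF both freezes each patient's trajectory at dropout and understates the uncertainty, which is why the principled alternatives discussed elsewhere in this collection are preferred.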