    Multiple Imputation for Individual Patient Data Meta-Analyses.

    The term meta-analysis refers to a set of statistical techniques for combining findings from different studies in order to draw more definitive conclusions about some treatment or exposure effect of interest in a particular context. Recently, meta-analyses which aim to combine the individual observations collected in each study, instead of simple summary measures, have been gaining in popularity in medical research. The main advantage of these so-called Individual Patient Data Meta-Analyses (IPD-MA) is that they have much more statistical power to investigate heterogeneity of the contributing studies and to explore treatment-covariate effects. Unfortunately, missing data are a common problem that affects nearly every dataset in clinical or epidemiological studies, and therefore also the meta-analyses of such datasets. When not handled properly, missing data can lead to invalid inferences, and a lot of research has therefore focussed on deriving, implementing and disseminating appropriate methods. The motivation for this thesis comes from two IPD-MA, called INDANA and MAGGIC. Challenges introduced by missing data in these projects include the presence of wholly missing variables in some studies, the variety of types of partially observed variables, and the presence of interactions and non-linearities in the substantive models of interest. In this thesis we propose a Joint Modelling Multiple Imputation (JM-MI) approach to overcome these issues. Motivated by the lack of available software, in the first part of this thesis we develop and describe jomo, a new R package for multilevel MI. A key feature of jomo, compared to other packages for MI, is that it allows for random or fixed study-specific covariance matrices in the imputation model, therefore allowing for heteroscedasticity when imputing.
We then use this package to show that our proposed method can perform as well as the standard methods currently used to handle missing data in IPD-MA with partially observed continuous variables. Furthermore, we show how it performs in more challenging situations, i.e. when imputing missing data in studies with few observations or even with systematically missing variables. We then extend the method to include partially observed variables that are not continuous, developing and evaluating a strategy based on latent normal variables to impute categorical data. Finally, we use the methods introduced to impute missing data in the two motivating meta-analyses, INDANA and MAGGIC.

    Growth and CD4 patterns of adolescents living with perinatally acquired HIV worldwide, a CIPHER cohort collaboration analysis

    INTRODUCTION: Adolescents living with HIV are subject to multiple co-morbidities, including growth retardation and immunodeficiency. We describe growth and CD4 evolution during adolescence using data from the Collaborative Initiative for Paediatric HIV Education and Research (CIPHER) global project. METHODS: Data were collected between 1994 and 2015 from 11 CIPHER networks worldwide. Adolescents with perinatally acquired HIV infection (APH) who initiated antiretroviral therapy (ART) before age 10 years, with at least one height or CD4 count measurement while aged 10–17 years, were included. Growth was measured using height-for-age Z-scores (HAZ, stunting if <-2 SD, WHO growth charts). Linear mixed-effects models were used to study the evolution of each outcome between ages 10 and 17. For growth, sex-specific models with fractional polynomials were used to model non-linear relationships for age at ART initiation, HAZ at age 10 and time, defined as current age from 10 to 17 years of age. RESULTS: A total of 20,939 and 19,557 APH were included for the growth and CD4 analyses, respectively. Half were females, two-thirds lived in East and Southern Africa, and median age at ART initiation ranged from 7 years in sub-Saharan African regions. At age 10, stunting ranged from 6% in North America and Europe to 39% in the Asia-Pacific; 19% overall had CD4 counts <500 cells/mm3. Across adolescence, higher HAZ was observed in females and among those in high-income countries. APH with stunting at age 10 and those with late ART initiation (after age 5) had the largest HAZ gains during adolescence, but these gains were insufficient to catch up with non-stunted, early ART-treated adolescents. From age 10 to 16 years, mean CD4 counts declined from 768 to 607 cells/mm3. This decline was observed across all regions, in males and females.
CONCLUSIONS: Growth patterns during adolescence differed substantially by sex and region, while CD4 patterns were similar, with an observed CD4 decline that needs further investigation. Early diagnosis and timely initiation of treatment in early childhood to prevent growth retardation and immunodeficiency are critical to improving APH growth and CD4 outcomes by the time they reach adulthood.

    The DURATIONS randomised trial design: Estimation targets, analysis methods and operating characteristics

    Background. Designing trials to reduce treatment duration is important in several therapeutic areas, including TB and antibiotics. We recently proposed a new randomised trial design to overcome some of the limitations of standard two-arm non-inferiority trials. This DURATIONS design involves randomising patients to a number of duration arms, and modelling the so-called duration-response curve. This article investigates the operating characteristics (type-1 and type-2 errors) of different statistical methods of drawing inference from the estimated curve. Methods. Our first estimation target is the shortest duration non-inferior to the control (maximum) duration within a specific risk difference margin. We compare different methods of estimating this quantity, including using model confidence bands, the delta method and bootstrap. We then explore the generalisability of results to estimation targets which focus on absolute event rates, risk ratio and gradient of the curve. Results. We show through simulations that, in most scenarios and for most of the estimation targets, using the bootstrap to estimate variability around the target duration leads to good results for DURATIONS design-appropriate quantities analogous to power and type-1 error. Using model confidence bands is not recommended, while the delta method leads to inflated type-1 error in some scenarios, particularly when the optimal duration is very close to one of the randomised durations. Conclusions. Using the bootstrap to estimate the optimal duration in a DURATIONS design has good operating characteristics in a wide range of scenarios, and can be used with confidence by researchers wishing to design a DURATIONS trial to reduce treatment duration. Uncertainty around several different targets can be estimated with this bootstrap approach.
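
The bootstrap idea described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It swaps in a piecewise-linear curve through the observed arm-level cure rates in place of the model used in the paper, resamples patients within each duration arm, and reports a simple percentile interval. The function names, the simulated cure rates, the 10% margin and the grid step are all hypothetical choices made for illustration.

```python
import random

def arm_rates(durations, cured):
    """Return the sorted duration arms and the observed cure proportion in each."""
    totals, events = {}, {}
    for d, y in zip(durations, cured):
        totals[d] = totals.get(d, 0) + 1
        events[d] = events.get(d, 0) + y
    arms = sorted(totals)
    return arms, [events[d] / totals[d] for d in arms]

def curve(arms, rates, d):
    """Piecewise-linear duration-response curve (a stand-in for a fitted model)."""
    for i in range(len(arms) - 1):
        if arms[i] <= d <= arms[i + 1]:
            slope = (rates[i + 1] - rates[i]) / (arms[i + 1] - arms[i])
            return rates[i] + slope * (d - arms[i])
    raise ValueError("duration outside the randomised range")

def shortest_noninferior(durations, cured, margin, step=0.5):
    """Shortest grid duration such that it, and every longer grid duration,
    has a cure rate within `margin` (risk difference) of the longest arm."""
    arms, rates = arm_rates(durations, cured)
    if len(arms) < 2:
        raise ValueError("need at least two duration arms")
    control = rates[-1]  # longest duration acts as control
    n_steps = int(round((arms[-1] - arms[0]) / step))
    best = arms[-1]
    for k in range(n_steps, -1, -1):  # walk down from the control duration
        d = min(arms[-1], arms[0] + k * step)
        if curve(arms, rates, d) < control - margin:
            break
        best = d
    return best

def bootstrap_interval(durations, cured, margin, n_boot=200, seed=1):
    """Point estimate plus a 95% percentile interval from a within-arm bootstrap."""
    rng = random.Random(seed)
    by_arm = {}
    for d, y in zip(durations, cured):
        by_arm.setdefault(d, []).append(y)
    estimates = []
    for _ in range(n_boot):
        ds, ys = [], []
        for d, outcomes in by_arm.items():
            ds.extend([d] * len(outcomes))
            ys.extend(rng.choices(outcomes, k=len(outcomes)))  # resample with replacement
        estimates.append(shortest_noninferior(ds, ys, margin))
    estimates.sort()
    lower = estimates[int(0.025 * (n_boot - 1))]
    upper = estimates[int(0.975 * (n_boot - 1))]
    return shortest_noninferior(durations, cured, margin), lower, upper

# Simulated trial: 7 duration arms (days), 150 patients each, made-up cure rates.
sim = random.Random(0)
true_rate = {8: 0.70, 10: 0.80, 12: 0.86, 14: 0.89, 16: 0.90, 18: 0.90, 20: 0.90}
durations, cured = [], []
for d, p in true_rate.items():
    for _ in range(150):
        durations.append(d)
        cured.append(1 if sim.random() < p else 0)

point, lower, upper = bootstrap_interval(durations, cured, margin=0.10)
print(point, lower, upper)
```

Resampling within arms keeps the randomised durations fixed in every bootstrap replicate, which seems a natural fit for a design with pre-specified arms; other resampling schemes are equally conceivable.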

    Multiple imputation for IPD meta-analysis: allowing for heterogeneity and studies with missing covariates.

    Recently, multiple imputation has been proposed as a tool for individual patient data meta-analysis with sporadically missing observations, and it has been suggested that within-study imputation is usually preferable. However, such within-study imputation cannot handle variables that are completely missing within studies. Further, if some of the contributing studies are relatively small, it may be appropriate to share information across studies when imputing. In this paper, we develop and evaluate a joint modelling approach to multiple imputation of individual patient data in meta-analysis, with an across-study probability distribution for the study-specific covariance matrices. This retains the flexibility to allow for between-study heterogeneity when imputing while allowing (i) sharing information on the covariance matrix across studies when this is appropriate, and (ii) imputing variables that are wholly missing from studies. Simulation results show both equivalent performance to the within-study imputation approach where this is valid, and good results in more general, practically relevant scenarios with studies of very different sizes, non-negligible between-study heterogeneity and wholly missing variables. We illustrate our approach using data from an individual patient data meta-analysis of hypertension trials. © 2015 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd

    Multiple Imputation with Survey Weights: A Multilevel Approach

    Multiple imputation is now well established as a practical and flexible method for analyzing partially observed data, particularly under the missing at random assumption. However, when the substantive model is a weighted analysis, there is concern about the empirical performance of Rubin’s rules and also about how to appropriately incorporate possible interaction between the weights and the distribution of the study variables. One approach that has been suggested is to include the weights in the imputation model, potentially also allowing for interactions with the other variables. We show that the theoretical criterion justifying this approach can be approximately satisfied if we stratify the weights to define level-two units in our data set and include random intercepts in the imputation model. Further, if we let the covariance matrix of the variables have a random distribution across the level-two units, we also allow imputation to reflect any interaction between weight strata and the distribution of the variables. We evaluate our proposal in a number of simulation scenarios, showing it has promising performance both in terms of coverage levels of the model parameters and bias of the associated Rubin’s variance estimates. We illustrate its application to a weighted analysis of factors predicting reception-year readiness in children in the UK Millennium Cohort Study.
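
For readers unfamiliar with the Rubin's rules mentioned in this abstract, the standard pooling formulas can be sketched as follows. This is a generic illustration of the textbook rules for a scalar parameter, not the weighted multilevel procedure the paper proposes; the function name `pool_rubin` and the example numbers are made up.

```python
import math
import statistics

def pool_rubin(estimates, variances):
    """Pool m completed-data estimates of a scalar parameter with Rubin's rules.

    estimates: point estimate from each imputed data set
    variances: squared standard error from each imputed data set
    Returns (pooled estimate, total variance, degrees of freedom).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    q_bar = statistics.fmean(estimates)      # pooled point estimate
    within = statistics.fmean(variances)     # average within-imputation variance
    between = statistics.variance(estimates) # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between   # Rubin's total variance
    if between == 0:
        df = math.inf                        # no between-imputation variability
    else:
        df = (m - 1) * (1 + within / ((1 + 1 / m) * between)) ** 2
    return q_bar, total, df

# Hypothetical results from three imputed data sets.
pooled, total_var, df = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, total_var, df)
```

The degrees of freedom here follow the classical large-sample formula; small-sample adjustments exist but are left out of this sketch.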
