
    Missing... presumed at random: cost-analysis of incomplete data

    When collecting patient-level resource use data for statistical analysis, for some patients and in some categories of resource use, the required count will not be observed. Although this problem must arise in most reported economic evaluations containing patient-level data, it is rare for authors to detail how the problem was overcome. Statistical packages may default to handling missing data through a so-called complete case analysis, while some recent cost-analyses have appeared to favour an available case approach. Both of these methods are problematic: complete case analysis is inefficient and is likely to be biased; available case analysis, by employing different numbers of observations for each resource use item, generates severe problems for standard statistical inference. Instead, we explore imputation methods for generating replacement values for missing data that will permit complete case analysis using the whole data set, and we illustrate these methods using two data sets that had incomplete resource use information.
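
    The contrast between complete case and available case analysis can be illustrated with a minimal sketch. The patient records, the resource use item names, and the helper functions below are hypothetical and are not taken from the paper; `None` marks an unobserved count.

```python
# Hypothetical patient-level resource use counts; None marks a missing value.
patients = [
    {"gp_visits": 2, "inpatient_days": 5, "outpatient_visits": 1},
    {"gp_visits": 4, "inpatient_days": None, "outpatient_visits": 3},
    {"gp_visits": None, "inpatient_days": 2, "outpatient_visits": 0},
    {"gp_visits": 1, "inpatient_days": None, "outpatient_visits": None},
    {"gp_visits": 3, "inpatient_days": 7, "outpatient_visits": 2},
]
items = ["gp_visits", "inpatient_days", "outpatient_visits"]


def mean(values):
    return sum(values) / len(values)


def complete_case_means(rows, keys):
    """Use only patients with every item observed (same n for all items)."""
    complete = [r for r in rows if all(r[k] is not None for k in keys)]
    return {k: mean([r[k] for r in complete]) for k in keys}, len(complete)


def available_case_means(rows, keys):
    """Use every observed value per item (n can differ between items)."""
    result = {}
    for k in keys:
        observed = [r[k] for r in rows if r[k] is not None]
        result[k] = (mean(observed), len(observed))
    return result


cc_means, cc_n = complete_case_means(patients, items)
ac = available_case_means(patients, items)

print("complete case:", cc_means, "n =", cc_n)
for item, (item_mean, item_n) in ac.items():
    print("available case:", item, round(item_mean, 2), "n =", item_n)
```

    In this toy data set the complete case analysis discards three of the five patients, while the available case analysis rests each item mean on a different number of observations, which is the source of the inferential difficulty the abstract describes.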

    Highly Irregular Functional Generalized Linear Regression with Electronic Health Records

    This work presents a new approach, called MISFIT, for fitting generalized functional linear regression models with sparsely and irregularly sampled data. Current methods do not allow for consistent estimation unless one assumes that the number of observed points per curve grows sufficiently quickly with the sample size. In contrast, MISFIT is based on a multiple imputation framework, which has the potential to produce consistent estimates without such an assumption. Just as importantly, it propagates the uncertainty of not having completely observed curves, allowing for a more accurate assessment of the uncertainty of parameter estimates, something that most methods currently cannot accomplish. This work is motivated by a longitudinal study on macrocephaly, or atypically large head size, in which electronic medical records allow for the collection of a great deal of data. However, the sampling is highly variable from child to child. Using MISFIT we are able to clearly demonstrate that the development of pathologic conditions related to macrocephaly is associated with both the overall head circumference of the children as well as the velocity of their head growth. Comment: 5 figures, 17 tables (including supplementary material), 34 pages (including supplementary material)

    MIDAS: A SAS Macro for Multiple Imputation Using Distance-Aided Selection of Donors

    In this paper we describe MIDAS: a SAS macro for multiple imputation using distance-aided selection of donors, which implements an iterative predictive mean matching hot-deck for imputing missing data. This is a flexible multiple imputation approach that can handle data in a variety of formats: continuous, ordinal, and scaled. Because the imputation models are implicit, it is not necessary to specify a parametric distribution for each variable to be imputed. MIDAS also allows the user to address the sensitivity of their inferences to different assumptions concerning the missing data mechanism. An example using MIDAS to impute missing data is presented and MIDAS is compared to existing missing data software.
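
    The general idea of a predictive mean matching hot-deck can be sketched as follows. This is a simplified, single-pass, one-predictor illustration and not the MIDAS macro itself: it omits the iteration, the distance-based donor selection probabilities, and the parameter uncertainty a full multiple imputation procedure would include. The function names, the data, and the choice of `k` nearest donors are all assumptions made for the example.

```python
import random


def fit_simple_ols(x, y):
    """Least-squares intercept and slope for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def pmm_impute(x, y, k=3, rng=None):
    """Fill each missing y with the observed y of a randomly chosen donor
    among the k observed cases whose predicted means are closest."""
    if rng is None:
        rng = random.Random(0)
    observed = [i for i, yi in enumerate(y) if yi is not None]
    intercept, slope = fit_simple_ols(
        [x[i] for i in observed], [y[i] for i in observed]
    )
    predicted = [intercept + slope * xi for xi in x]
    completed = list(y)
    for i, yi in enumerate(y):
        if yi is None:
            donors = sorted(
                observed, key=lambda j: abs(predicted[j] - predicted[i])
            )[:k]
            completed[i] = y[rng.choice(donors)]  # borrow a real observed value
    return completed


# Hypothetical data: x is fully observed, y has three missing entries.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, None, 6.2, 7.9, None, 12.1, 14.0, None]
completed = pmm_impute(x, y, k=3, rng=random.Random(42))
print(completed)
```

    Because every imputed value is borrowed from an observed case, the imputations stay within the range and format of the real data, which is why no parametric distribution has to be specified for the imputed variable.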

    Validation of methods for converting the original Disease Activity Score (DAS) to the DAS28

    © The Author(s) 2018. The Disease Activity Score (DAS) is integral in tailoring the clinical management of rheumatoid arthritis (RA) patients and is an important measure in clinical research. Different versions have been developed over the years to improve reliability and ease of use. Combining the original DAS and the newer DAS28 data in both contemporary and historical studies is important for both primary and secondary data analyses. As such, a methodologically robust means of converting the old DAS to the new DAS28 measure would be invaluable. Using data from The Early RA Study (ERAS), a sub-sample of patients with both DAS and DAS28 data was used to develop new regression imputation formulas using the total DAS score (univariate), and using the separate components of the DAS score (multivariate). DAS scores were transformed to DAS28 using an existing formula quoted in the literature, and the newly developed formulas. Bland and Altman plots were used to compare the transformed DAS with the recorded DAS28 to ascertain levels of agreement. The current transformation formula tended to overestimate the true DAS28 score, particularly at the higher end of the scale. A formula which uses all separate components of the DAS was found to estimate the scores with a higher level of precision. A new formula is proposed that can be used by other early RA cohorts to convert the original DAS to DAS28.
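
    The agreement statistics behind a Bland and Altman plot can be sketched in a few lines. The paired scores below are invented for illustration and do not come from ERAS, and no conversion formula from the paper is reproduced here; the sketch only computes the mean difference (bias) and the conventional 95% limits of agreement, assuming approximately normally distributed differences.

```python
from statistics import mean, stdev


def bland_altman(estimated, recorded):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Limits of agreement are bias +/- 1.96 * SD of the paired differences.
    """
    differences = [e - r for e, r in zip(estimated, recorded)]
    bias = mean(differences)
    spread = 1.96 * stdev(differences)
    return bias, bias - spread, bias + spread


# Hypothetical transformed scores and the recorded scores they should match.
transformed_das = [3.2, 4.1, 5.0, 5.9, 6.8]
recorded_das28 = [3.0, 4.0, 4.7, 5.5, 6.2]

bias, lower, upper = bland_altman(transformed_das, recorded_das28)
print(f"bias = {bias:.2f}, limits of agreement = ({lower:.2f}, {upper:.2f})")
```

    A positive bias in this toy example means the transformed scores sit above the recorded ones on average, and differences that widen at higher scores would show up in the plot as a trend across the scale.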

    Sequential Regression Multiple Imputation for Incomplete Multivariate Data using Markov Chain Monte Carlo

    This paper discusses the theoretical background to handling missing data in a multivariate context. Earlier methods for dealing with item non-response are reviewed, followed by an examination of some of the more modern methods and, in particular, multiple imputation. One such technique, known as sequential regression multivariate imputation, which employs a Markov chain Monte Carlo algorithm, is described and implemented. It is demonstrated that distributional convergence is rapid and only a few imputations are necessary in order to produce accurate point estimates and preserve multivariate relationships, whilst adequately accounting for the uncertainty introduced by the imputation procedure. It is further shown that lower fractions of missing data and the inclusion of relevant covariates in the imputation model are desirable in terms of bias reduction. Keywords: Missing data; Item non-response; Missingness mechanism; Imputation; Regression; Markov chain Monte Carlo.
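
    How multiple imputation accounts for imputation uncertainty can be sketched with the standard combining rules (Rubin's rules). The abstract does not state which pooling rule the paper uses, so applying these rules here is an assumption, and the per-imputation estimates and variances are invented numbers.

```python
from statistics import mean, variance


def pool_rubin(estimates, within_variances):
    """Pool m completed-data analyses.

    Point estimate: mean of the m estimates.
    Total variance: W + (1 + 1/m) * B, where W is the mean within-imputation
    variance and B is the between-imputation (sample) variance.
    """
    m = len(estimates)
    pooled_estimate = mean(estimates)
    within = mean(within_variances)
    between = variance(estimates)
    total_variance = within + (1 + 1 / m) * between
    return pooled_estimate, total_variance


# Hypothetical results from m = 5 imputed data sets.
estimates = [10.0, 11.0, 9.0, 10.5, 9.5]
within_variances = [1.0, 1.2, 0.9, 1.1, 0.8]

pooled, total = pool_rubin(estimates, within_variances)
print(f"pooled estimate = {pooled:.2f}, total variance = {total:.3f}")
```

    The between-imputation term is what carries the extra uncertainty due to the missing values; with a single imputation it cannot be estimated at all, and its weight (1 + 1/m) shrinks quickly as m grows, consistent with the claim that only a few imputations are needed.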