59 research outputs found

    Significance tests and estimates for R2 for multiple regression in multiply imputed datasets: A cautionary note on earlier findings, and alternative solutions

    Whenever multiple regression is applied to a multiply imputed data set, several methods for combining significance tests for R2 and the change in R2 across imputed data sets may be used: the combination rules by Rubin, the Fisher z-test for R2 by Harel, and F-tests for the change in R2 by Chaurasia and Harel. For pooling R2 itself, Harel proposed a method based on a Fisher z transformation. In the current article, it is argued that the pooled R2 based on the Fisher z transformation, the Fisher z-test for R2, and the F-tests for the change in R2 have some theoretical flaws. An argument is made for using Rubin’s method for pooling significance tests for R2 instead, and alternative procedures for pooling R2 are proposed: simple averaging and a pooled R2 constructed from the pooled significance test by Rubin. Simulations show that the Fisher z-test and Chaurasia and Harel’s F-tests generally give inflated type-I error rates, whereas the type-I error rates of Rubin’s method are correct. Of the methods for pooling the point estimates of R2, no method clearly performs best, but it is argued that the average of R2 across imputed data sets is preferred.
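    To make the contrast between the pooling approaches concrete, here is a minimal Python sketch, assuming the per-imputation R2 values have already been computed; the function names and example values are hypothetical, and the Fisher z variant only illustrates the transform-average-backtransform idea attributed to Harel, not any official implementation.

    ```python
    import numpy as np

    def pool_r2_average(r2_values):
        """Pool R2 across imputed data sets by simple averaging
        (the approach argued for in the abstract above)."""
        return float(np.mean(r2_values))

    def pool_r2_fisher_z(r2_values):
        """Pool R2 via a Fisher z transformation of R = sqrt(R2),
        the kind of approach the abstract critiques."""
        r = np.sqrt(np.asarray(r2_values, dtype=float))
        z = np.arctanh(r)                      # Fisher z transform of each multiple correlation
        return float(np.tanh(z.mean()) ** 2)   # back-transform the mean z, then square

    # Hypothetical R2 values from m = 5 imputed data sets.
    r2_per_imputation = [0.31, 0.28, 0.35, 0.30, 0.33]
    print(pool_r2_average(r2_per_imputation))   # ~0.314
    print(pool_r2_fisher_z(r2_per_imputation))  # slightly different pooled value
    ```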

    Multiple imputation of incomplete categorical data using latent class analysis

    We propose using latent class analysis as an alternative to log-linear analysis for the multiple imputation of incomplete categorical data. Similar to log-linear models, latent class models can be used to describe complex association structures between the variables used in the imputation model. However, unlike log-linear models, latent class models can be used to build large imputation models containing more than a few categorical variables. To obtain imputations reflecting uncertainty about the unknown model parameters, we use a nonparametric bootstrap procedure as an alternative to the more common full Bayesian approach. The proposed multiple imputation method, which is implemented in Latent GOLD software for latent class analysis, is illustrated with two examples. In a simulated data example, we compare the new method to well-established methods such as maximum likelihood.
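    The bootstrap idea can be sketched generically: refit the imputation model on a resampled version of the data before each of the m imputations, so the completed data sets also reflect uncertainty about the model parameters. In the Python sketch below the latent class model (fitted in Latent GOLD in the article) is replaced by a deliberately crude stand-in that only estimates marginal category frequencies; all function names and the stand-in model are hypothetical.

    ```python
    import numpy as np
    import pandas as pd

    def fit_stand_in_model(df):
        """Hypothetical stand-in for fitting the imputation model on a (bootstrap)
        sample. The article fits a latent class model to capture associations between
        variables; here we only estimate marginal category frequencies per variable,
        which keeps the sketch short and runnable."""
        return {col: df[col].dropna().value_counts(normalize=True) for col in df.columns}

    def impute_from_model(df, model, rng):
        """Draw every missing value from the fitted (stand-in) model."""
        out = df.copy()
        for col in df.columns:
            probs = model[col]
            miss = out[col].isna()
            out.loc[miss, col] = rng.choice(probs.index.to_numpy(),
                                            size=int(miss.sum()),
                                            p=probs.to_numpy())
        return out

    def bootstrap_multiple_imputation(df, m=5, seed=0):
        """Nonparametric bootstrap MI: refit the imputation model on a bootstrap
        sample of the rows before each imputation, so the m completed data sets
        also reflect uncertainty about the model parameters."""
        rng = np.random.default_rng(seed)
        completed = []
        for _ in range(m):
            boot = df.sample(n=len(df), replace=True,
                             random_state=int(rng.integers(10**9)))
            completed.append(impute_from_model(df, fit_stand_in_model(boot), rng))
        return completed
    ```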

    Multiple imputation in data that grow over time: A comparison of three strategies

    Multiple imputation is a recommended technique to deal with missing data. We study the problem where the investigator has already created imputations before the arrival of the next wave of data. The newly arriving data contain missing values that need to be imputed. The standard method (RE-IMPUTE) is to combine the new and old data before imputation and re-impute all missing values in the combined data. We study the properties of two methods that impute the missing data in the new part only, thus preserving the historic imputations. Method NEST multiply imputes the new data conditional on each filled-in old data set m times. Method APPEND is the special case of NEST with m = 1, thus appending each filled-in data set by single imputation. We found that NEST and APPEND have the same validity as RE-IMPUTE for monotone missing data patterns. NEST and APPEND also work well when relations within waves are stronger than between waves and for moderate percentages of missing data. We do not recommend the use of NEST or APPEND when relations within time points are weak and when associations between time points are strong.
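    As a rough illustration of the difference between re-imputing everything and preserving historic imputations, the sketch below uses scikit-learn's IterativeImputer as a stand-in for the chained-equations imputation model; the data, the missingness mechanism, and the variable names are hypothetical, and only RE-IMPUTE and APPEND (NEST with one nested imputation per completed old data set) are shown.

    ```python
    import numpy as np
    from sklearn.experimental import enable_iterative_imputer  # noqa: F401
    from sklearn.impute import IterativeImputer

    def impute_once(data, seed):
        """One stochastic chained-equations imputation (a stand-in for the
        imputation model assumed in this sketch)."""
        return IterativeImputer(sample_posterior=True, random_state=seed).fit_transform(data)

    rng = np.random.default_rng(0)
    m = 5

    # Hypothetical wave-1 data (already multiply imputed m times) and a newly
    # arrived wave-2 with its own missing values.
    old_wave = rng.normal(size=(100, 3))
    old_wave[rng.random(old_wave.shape) < 0.10] = np.nan
    new_wave = rng.normal(size=(100, 2))
    new_wave[rng.random(new_wave.shape) < 0.10] = np.nan
    old_imputations = [impute_once(old_wave, seed=i) for i in range(m)]

    # RE-IMPUTE: pool old and new columns and re-impute every missing value from
    # scratch, discarding the historic imputations.
    re_impute = [impute_once(np.hstack([old_wave, new_wave]), seed=100 + i) for i in range(m)]

    # APPEND: keep each historic imputation fixed and impute only the new wave,
    # conditional on that completed old data set.
    append = [impute_once(np.hstack([old_i, new_wave]), seed=200 + i)
              for i, old_i in enumerate(old_imputations)]
    ```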

    Rebutting existing misconceptions about multiple imputation as a method for handling missing data

    Missing data is a problem that occurs frequently in many scientific areas. The most sophisticated method for dealing with this problem is multiple imputation. Contrary to other methods, like listwise deletion, this method does not throw away information, and partly repairs the problem of systematic dropout. Although from a theoretical point of view multiple imputation is considered to be the optimal method, many applied researchers are reluctant to use it because of persistent misconceptions about this method. Instead of providing an(other) overview of missing data methods, or extensively explaining how multiple imputation works, this article aims specifically at rebutting these misconceptions, and provides applied researchers with practical arguments supporting them in the use of multiple imputation.