Missing... presumed at random: cost-analysis of incomplete data
When patient-level resource use data are collected for statistical analysis, the required count will go unobserved for some patients and some categories of resource use. Although this problem must arise in most reported economic evaluations containing patient-level data, authors rarely detail how it was overcome. Statistical packages may default to handling missing data through a so-called complete case analysis, while some recent cost-analyses appear to favour an available case approach. Both methods are problematic: complete case analysis is inefficient and likely to be biased; available case analysis, by employing a different number of observations for each resource use item, creates severe problems for standard statistical inference. Instead, we explore imputation methods that generate replacement values for missing data, permitting complete case analysis on the whole data set, and we illustrate these methods using two data sets with incomplete resource use information.
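The difference between the two default approaches named in this abstract can be made concrete with a minimal sketch. The patient records, item names, and helper functions below are hypothetical and are not taken from the paper; the sketch only shows that complete case analysis discards every patient with any missing item, while available case analysis ends up with a different number of observations per item.

```python
from statistics import mean

# Hypothetical patient-level resource use counts; None marks a missing count.
patients = [
    {"gp_visits": 3, "inpatient_days": 2, "outpatient_visits": 1},
    {"gp_visits": 5, "inpatient_days": None, "outpatient_visits": 4},
    {"gp_visits": None, "inpatient_days": 0, "outpatient_visits": 2},
    {"gp_visits": None, "inpatient_days": 6, "outpatient_visits": None},
    {"gp_visits": 4, "inpatient_days": 1, "outpatient_visits": 3},
]
ITEMS = ["gp_visits", "inpatient_days", "outpatient_visits"]


def complete_case_means(rows):
    """Use only patients with every item observed; return (means, n)."""
    complete = [r for r in rows if all(r[k] is not None for k in ITEMS)]
    return {k: mean(r[k] for r in complete) for k in ITEMS}, len(complete)


def available_case_means(rows):
    """Use every observed value per item; return {item: (mean, n)}."""
    result = {}
    for k in ITEMS:
        observed = [r[k] for r in rows if r[k] is not None]
        result[k] = (mean(observed), len(observed))
    return result


cc_means, cc_n = complete_case_means(patients)
ac = available_case_means(patients)

print("complete case n:", cc_n, cc_means)
for item, (item_mean, item_n) in ac.items():
    print("available case", item, "mean:", item_mean, "n:", item_n)
```

In this toy data, complete case analysis keeps two of the five patients, while available case analysis uses three observations for one item and four for the others, which is the varying-denominator problem the abstract describes.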
Multiple Imputation Using Gaussian Copulas
Missing observations are pervasive throughout empirical research, especially
in the social sciences. Despite multiple approaches to dealing adequately with
missing data, many scholars still fail to address this vital issue. In this
paper, we present a simple-to-use method for generating multiple imputations
using a Gaussian copula. The Gaussian copula for multiple imputation (Hoff,
2007) allows scholars to attain estimation results that have good coverage and
small bias. The use of copulas to model the dependence among variables will
enable researchers to construct valid joint distributions of the data, even
without knowledge of the actual underlying marginal distributions. Multiple
imputations are then generated by drawing observations from the resulting
posterior joint distribution and replacing the missing values. Using simulated
and observational data from published social science research, we compare
imputation via Gaussian copulas with two other widely used imputation methods:
MICE and Amelia II. Our results suggest that the Gaussian copula approach has
slightly smaller bias, higher coverage rates, and narrower confidence intervals
than the other methods. This is especially true when the variables with
missing data are not normally distributed. These results, combined with
theoretical guarantees and ease of use, suggest that the approach examined
provides an attractive alternative for applied researchers undertaking multiple
imputation.
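The copula idea in this abstract (model dependence on a normal scale, leave the margins unspecified) can be illustrated with a deliberately simplified sketch. It is not the Bayesian procedure of Hoff (2007): it handles only two variables, assumes the first is fully observed, and holds the correlation fixed at a point estimate instead of drawing it from a posterior, so it understates imputation uncertainty. All function names and data are hypothetical.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def normal_scores(values):
    """Map values to normal scores through their ranks (empirical CDF)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        scores[i] = STD_NORMAL.inv_cdf(rank / (n + 1))
    return scores


def pearson(a, b):
    """Plain Pearson correlation of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def copula_impute(x, y, m=5, seed=0):
    """Return m completed copies of y (None = missing); x is fully observed."""
    rng = random.Random(seed)
    observed = [i for i, v in enumerate(y) if v is not None]
    y_sorted = sorted(y[i] for i in observed)
    zx = normal_scores(x)
    zy = normal_scores([y[i] for i in observed])
    # Dependence is estimated on the normal-score scale from complete pairs.
    rho = pearson([zx[i] for i in observed], zy)
    rho = max(-1.0, min(1.0, rho))  # guard against floating-point overshoot
    noise_sd = math.sqrt(1.0 - rho * rho)

    completed = []
    for _ in range(m):
        copy = list(y)
        for i, v in enumerate(y):
            if v is None:
                # Draw the latent score conditional on x's score ...
                z = rho * zx[i] + noise_sd * rng.gauss(0.0, 1.0)
                # ... and map it back through the empirical quantiles of y.
                p = STD_NORMAL.cdf(z)
                k = min(int(p * len(y_sorted)), len(y_sorted) - 1)
                copy[i] = y_sorted[k]
        completed.append(copy)
    return completed


# Hypothetical skewed data with three missing y values.
x = [0.5, 1.2, 1.9, 2.4, 3.1, 3.8, 4.4, 5.0, 5.7, 6.3]
y = [1.6, 1.1, None, 4.9, 3.0, None, 9.2, 30.1, None, 13.5]
completed = copula_impute(x, y, m=3, seed=42)
for copy in completed:
    print(copy)
```

Because imputed values are read off the empirical quantiles of the observed data, they stay on the scale of the skewed variable without any assumption that it is normally distributed, which is the setting where the abstract reports the largest gains.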
An Empirical Comparison of Multiple Imputation Methods for Categorical Data
Multiple imputation is a common approach for dealing with missing values in
statistical databases. The imputer fills in missing values with draws from
predictive models estimated from the observed data, resulting in multiple,
completed versions of the database. Researchers have developed a variety of
default routines to implement multiple imputation; however, there has been
limited research comparing the performance of these methods, particularly for
categorical data. We use simulation studies to compare repeated sampling
properties of three default multiple imputation methods for categorical data,
including chained equations using generalized linear models, chained equations
using classification and regression trees, and a fully Bayesian joint
distribution based on Dirichlet Process mixture models. We base the simulations
on categorical data from the American Community Survey. In the circumstances of
this study, the results suggest that default chained equations approaches based
on generalized linear models are dominated by the default regression tree and
Bayesian mixture model approaches. They also suggest competing advantages for
the regression tree and Bayesian mixture model approaches, making both
reasonable default engines for multiple imputation of categorical data.
Supplementary material for this article is available online.
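The general recipe described at the start of this abstract (fill in missing values with draws from a predictive model, producing several completed data sets) can be sketched for one categorical variable. The model here is an assumption chosen for brevity, a per-group multinomial with a uniform Dirichlet prior; it is not one of the three engines the paper compares. The data, variable names, and the pooling step using Rubin's combining rules are likewise illustrative additions.

```python
import random
from collections import Counter
from statistics import mean, variance


def draw_dirichlet(alphas, rng):
    """Draw one probability vector from a Dirichlet via normalised gammas."""
    gammas = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(gammas)
    return [g / total for g in gammas]


def impute_once(rows, categories, rng):
    """rows: list of (group, value) pairs, value None when missing."""
    counts = {}
    for group, value in rows:
        counts.setdefault(group, Counter())
        if value is not None:
            counts[group][value] += 1
    # Draw category probabilities per group from the posterior (uniform
    # prior), so parameter uncertainty carries into the imputations.
    probs = {
        g: draw_dirichlet([c[k] + 1 for k in categories], rng)
        for g, c in counts.items()
    }
    return [
        (g, v if v is not None else rng.choices(categories, weights=probs[g])[0])
        for g, v in rows
    ]


def pool(estimates, variances):
    """Rubin's rules: pooled estimate and total variance over m data sets."""
    m = len(estimates)
    within = mean(variances)
    between = variance(estimates)  # sample variance, divisor m - 1
    return mean(estimates), within + (1 + 1 / m) * between


# Hypothetical survey records: (region, housing tenure).
rows = [
    ("urban", "rent"), ("urban", "rent"), ("urban", "own"), ("urban", None),
    ("urban", "rent"), ("rural", "own"), ("rural", "own"), ("rural", None),
    ("rural", "rent"), ("rural", "own"), ("rural", None), ("urban", "own"),
]
CATEGORIES = ["own", "rent"]

rng = random.Random(7)
estimates, variances = [], []
for _ in range(20):
    completed = impute_once(rows, CATEGORIES, rng)
    n = len(completed)
    p = sum(1 for _, v in completed if v == "own") / n
    estimates.append(p)
    variances.append(p * (1 - p) / n)

pooled, total_var = pool(estimates, variances)
print("pooled share owning:", round(pooled, 3), "total variance:", round(total_var, 4))
```

Swapping `impute_once` for a richer predictive model (chained regressions, trees, or a mixture model) leaves the surrounding loop and the pooling step unchanged, which is why such engines can be compared on equal footing.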