The latent class model is a powerful unsupervised clustering algorithm for
categorical data. Many statistics exist to test the fit of the latent class
model. However, traditional methods to evaluate those fit statistics are not
always useful. Asymptotic distributions are not always known, and empirical
reference distributions can be very time-consuming to obtain. In this paper we
propose a fast resampling scheme with which any type of model fit can be
assessed. We illustrate it here on the latent class model, but the methodology
can be applied in any situation.
The principle behind the lazy bootstrap method is to specify a statistic
which captures the characteristics of the data that a model should capture
correctly. If those characteristics in the observed data and in model-generated
data are very different we can assume that the model could not have produced
the observed data. With this method we achieve the flexibility of tests from
the Bayesian framework, while only needing maximum likelihood estimates. We
provide a step-wise algorithm with which the fit of a model can be assessed
based on the characteristics we as researchers find important. In a Monte Carlo
study we show that the method has very low type I error rates for all illustrated
statistics. Power to reject a model depended largely on the type of statistic
that was used and on sample size. We applied the method to an empirical data
set on clinical subgroups with risk of myocardial infarction and compared the
results directly to the parametric bootstrap. The results of our method were
highly similar to those obtained by the parametric bootstrap, while the
required computations differed by three orders of magnitude in favour of our
method.

Comment: This is an adaptation of a chapter of a PhD dissertation available at
https://pure.uvt.nl/portal/files/19030880/Kollenburg_Computer_13_11_2017.pd
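
The principle stated in the abstract, comparing a chosen statistic on the observed data with the same statistic on model-generated data, can be illustrated with a minimal sketch. This is not the step-wise algorithm of the paper. It assumes a latent class model for binary items whose parameters are already estimated and held fixed, so the model is not refitted to each replicate (an assumption of this sketch, not something the abstract confirms). The function names, the `pair_agreement` statistic, and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def sample_latent_class(n, class_weights, item_probs, rng):
    """Draw n binary response vectors from a latent class model.

    class_weights[c] is P(class = c);
    item_probs[c][j] is P(item j = 1 | class = c).
    """
    classes = rng.choices(range(len(class_weights)), weights=class_weights, k=n)
    return [[1 if rng.random() < p else 0 for p in item_probs[c]] for c in classes]


def pair_agreement(data, j=0, k=1):
    """Hypothetical fit statistic: share of rows in which items j and k agree."""
    return sum(row[j] == row[k] for row in data) / len(data)


def resampling_p_value(observed, class_weights, item_probs, statistic,
                       n_rep=500, seed=1):
    """Compare the observed statistic with statistics of model-generated data.

    Parameters are held fixed at their estimates (assumption of this sketch);
    no model is re-estimated on the replicates.
    """
    rng = random.Random(seed)
    t_obs = statistic(observed)
    t_rep = [
        statistic(sample_latent_class(len(observed), class_weights, item_probs, rng))
        for _ in range(n_rep)
    ]
    # Upper-tail p-value: proportion of replicates at least as large as observed.
    return sum(t >= t_obs for t in t_rep) / n_rep


# Hypothetical two-class model with three binary items.
weights = [0.6, 0.4]
probs = [[0.9, 0.8, 0.7], [0.2, 0.3, 0.1]]

# Data that the model could have produced.
observed = sample_latent_class(300, weights, probs, random.Random(7))
p_fit = resampling_p_value(observed, weights, probs, pair_agreement)

# Data with perfect agreement between items 0 and 1, which this model
# cannot plausibly reproduce.
misfit = [[1, 1, 0]] * 150 + [[0, 0, 1]] * 150
p_misfit = resampling_p_value(misfit, weights, probs, pair_agreement)
```

A small p-value indicates that the observed characteristic is rarely reproduced by the model. Any other statistic can be passed in place of `pair_agreement`, which reflects the flexibility the abstract describes.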