
Visualizing Count Data Regressions Using Rootograms

Abstract

The rootogram is a graphical tool associated with the work of J. W. Tukey that was originally used for assessing goodness of fit of univariate distributions. Here we show that rootograms are also useful for diagnosing and treating issues such as overdispersion and/or excess zeros in regression models for count data. We also introduce a weighted version of the rootogram that can be applied out of sample or to (weighted) subsets of the data, e.g., in finite mixture models. Two empirical illustrations are included, one from ethology, the other from public health. The former employs a negative binomial hurdle regression, the latter a two-component finite mixture of negative binomial models. A further illustration involving underdispersion and an R implementation of our tools are available in the R package 'countreg'.
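As a rough illustration of the idea (not the 'countreg' implementation), the quantities behind a hanging rootogram can be sketched in a few lines of Python. The sketch assumes a Poisson model with one fitted mean per observation. The helper names `poisson_pmf` and `rootogram_table` and the toy data are hypothetical and chosen only for this example. Expected frequencies are obtained by summing the fitted probabilities over all observations, and observed and expected frequencies are compared on the square-root scale.

```python
import math
from collections import Counter


def poisson_pmf(k, mu):
    """Poisson probability P(Y = k) for mean mu > 0, computed in log space."""
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))


def rootogram_table(counts, fitted_means, max_count=None):
    """Return the quantities needed to draw a hanging rootogram.

    counts       : observed count responses
    fitted_means : fitted Poisson mean for each observation (assumed model)
    """
    if max_count is None:
        max_count = max(counts)
    observed = Counter(counts)
    rows = []
    for j in range(max_count + 1):
        obs = observed.get(j, 0)
        # Expected frequency of count j: sum of fitted probabilities.
        exp = sum(poisson_pmf(j, mu) for mu in fitted_means)
        # Hanging style: the bar hangs from sqrt(expected) and has
        # height sqrt(observed), so its bottom is the difference.
        bar_top = math.sqrt(exp)
        bar_bottom = bar_top - math.sqrt(obs)
        rows.append({
            "count": j,
            "observed": obs,
            "expected": exp,
            "bar_top": bar_top,
            "bar_bottom": bar_bottom,
        })
    return rows


# Hypothetical toy data with many zeros and a long right tail.
counts = [0, 0, 0, 0, 0, 0, 1, 1, 2, 3, 5, 8]
# Intercept-only Poisson fit: every observation gets the sample mean.
sample_mean = sum(counts) / len(counts)
fitted_means = [sample_mean] * len(counts)

rows = rootogram_table(counts, fitted_means)
for row in rows:
    print(row["count"], row["observed"],
          round(row["expected"], 2), round(row["bar_bottom"], 2))
```

A bar whose bottom falls below zero marks a count that occurs more often than the model expects, and a bar that stops above zero marks a count that occurs less often. In this toy example the bar at zero drops below the axis, which is the kind of pattern that suggests excess zeros.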
