Models for Count Data in the Presence of Outliers and/or Excess Zero

Abstract

Violations of Poisson assumptions usually result in overdispersion, where the variance exceeds the mean. An excess (or deficiency) of zero counts also results in overdispersion. Violations of equidispersion indicate correlation in the data, which affects the standard errors of the parameter estimates; model fit is also affected (Hilbe 2008). This study therefore examined how outliers and excess zeros in count data cause overdispersion, and focuses on identifying the model(s) that can handle their impact. Datasets were simulated from a Poisson model for sample sizes of 20, 50 and 100 and then contaminated with outliers and excess zeros. Parameters were estimated by maximum likelihood. Model selection was based on the dispersion index, AIC, BIC and log-likelihood statistics, comparing the Poisson, Negative Binomial, Zero-Inflated Poisson (ZIP) and Zero-Inflated Negative Binomial (ZINB) models. The results indicate that ZINB is the best model for analyzing count data in the presence of outliers and/or excess zeros.

Keywords: Count data, Overdispersion, Excess zeros, Outliers, Goodness of fit, Poisson, Negative Binomial, Zero-inflated model
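
The simulation design described in the abstract can be illustrated with a minimal sketch. It is not the study's own code. The abstract does not report the Poisson mean, the proportion of added zeros, or the number and size of the outliers, so the values used here (`lam=3.0`, `zero_prob=0.3`, two outliers of value 30) and all function names are assumptions chosen for illustration. The sketch covers only the dispersion index and an intercept-only Poisson baseline with its log-likelihood, AIC and BIC; fitting the Negative Binomial, ZIP and ZINB models requires numerical optimization and is not shown.

```python
import math
import random


def poisson_sample(lam, rng):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_counts(n, lam, zero_prob, n_outliers, outlier_value, seed=0):
    """Poisson counts with extra zeros and a few large values injected.

    All contamination settings are hypothetical, not taken from the study.
    """
    rng = random.Random(seed)
    y = [0 if rng.random() < zero_prob else poisson_sample(lam, rng)
         for _ in range(n)]
    # Overwrite randomly chosen positions with a large count (the "outliers").
    for i in rng.sample(range(n), n_outliers):
        y[i] = outlier_value
    return y


def dispersion_index(y):
    """Sample variance divided by sample mean; close to 1 under equidispersion."""
    n = len(y)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / (n - 1)
    return var / mean


def poisson_fit(y):
    """Intercept-only Poisson MLE: returns (lambda_hat, loglik, AIC, BIC)."""
    n = len(y)
    lam = sum(y) / n  # the MLE of the Poisson mean is the sample mean
    loglik = sum(v * math.log(lam) - lam - math.lgamma(v + 1) for v in y)
    k = 1  # one estimated parameter
    return lam, loglik, 2 * k - 2 * loglik, k * math.log(n) - 2 * loglik


if __name__ == "__main__":
    for n in (20, 50, 100):  # sample sizes named in the abstract
        y = simulate_counts(n, lam=3.0, zero_prob=0.3,
                            n_outliers=2, outlier_value=30, seed=n)
        lam_hat, ll, aic, bic = poisson_fit(y)
        print(n, round(dispersion_index(y), 2), round(aic, 1), round(bic, 1))
```

A dispersion index well above 1 for the contaminated samples signals that the Poisson baseline is inadequate, which is the situation in which the abstract's comparison against the Negative Binomial and zero-inflated models becomes relevant.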
