Assessment of Statistical Approaches to Model Low Count Data: An Empirical Application to Youth Delinquency

Abstract

Objectives: The aim of this study was to identify the risk factors associated with the number of crimes committed by youth (youth delinquency) between ages 10 and 17, using Ordinary Least Squares (OLS), the Poisson regression model (PRM), the negative binomial regression model (NBRM) and the zero-inflated negative binomial model (ZINB), and to choose the most appropriate model for the observed count data.

Methodology: The data were collected from youth whose mothers were enrolled in the Philadelphia Collaborative Perinatal Project (CPP). School and delinquency records (between ages 10 and 17) were obtained by the Centre for Studies in Criminology and Criminal Law. The literature suggests that factors associated with child delinquency fall into four main groups: individual, family, school and peer. Variables were therefore included in the analysis accordingly.

Result: For OLS, the scatter plot of residuals versus estimated counts showed a definite pattern of heteroscedasticity (non-constant variance). The likelihood-ratio (LR) test of overdispersion yielded a significant p-value, implying that the outcome variable is overdispersed. The plot of the difference between the actual probabilities and the mean predicted probabilities for each model showed that PRM predicts low counts (0-2) poorly.

Conclusion: NBRM and ZINB both performed well; however, fit statistics showed that NBRM gave closer predictions than ZINB. NB modeling techniques provide more compelling and accurate results than the basic PRM or simple linear or log-linear modeling techniques.
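The comparison of actual and predicted probabilities for low counts can be illustrated with a minimal sketch. It is not the study's analysis: the counts below are hypothetical toy data, the models are intercept-only (no risk-factor covariates), and the NB dispersion parameter `alpha` is estimated by the method of moments rather than by maximum likelihood. The names `counts`, `poisson_pmf`, `negbin_pmf` and `gaps` are assumptions introduced for illustration.

```python
import math
from collections import Counter

# Hypothetical toy counts (NOT the study's data): many zeros and a long right tail.
counts = [0] * 60 + [1] * 15 + [2] * 8 + [3] * 5 + [4] * 4 + [6] * 3 + [8] * 2 + [10] * 2 + [15]

n = len(counts)
mean = sum(counts) / n
var = sum((y - mean) ** 2 for y in counts) / (n - 1)  # sample variance


def poisson_pmf(k, mu):
    """Poisson probability of observing count k with mean mu."""
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))


def negbin_pmf(k, mu, alpha):
    """Negative binomial (NB2) probability: Var(y) = mu + alpha * mu**2."""
    r = 1.0 / alpha
    log_p = (
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(r / (r + mu))
        + k * math.log(mu / (r + mu))
    )
    return math.exp(log_p)


# Method-of-moments dispersion estimate (a simplification of the usual ML fit).
# A value above zero means the variance exceeds the mean, i.e. overdispersion.
alpha = (var - mean) / mean ** 2

# Observed minus predicted probability for the low counts 0, 1 and 2.
observed = Counter(counts)
gaps = {}
for k in range(3):
    obs = observed[k] / n
    gaps[k] = (obs - poisson_pmf(k, mean), obs - negbin_pmf(k, mean, alpha))
    print(f"count={k}  obs-Poisson={gaps[k][0]:+.3f}  obs-NB={gaps[k][1]:+.3f}")
```

With covariates, the same comparison would normally be made on fitted regression models, and a likelihood-ratio test of `alpha = 0` would replace the informal variance-versus-mean check used here.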
