Hybrid dragonfly algorithm with neighbourhood component analysis and gradient tree boosting for crime rates modelling

Abstract

In crime studies, time series prediction of crime rates supports the formulation of strategic crime prevention measures and decision making. Statistical models are commonly applied to predict time series crime rates. However, time series crime rates data are limited and mostly nonlinear, whereas statistical models are mainly linear and can only model linear relationships. Thus, this study proposed a time series crime prediction model that can handle nonlinear components as well as limited historical crime rates data. Recently, Artificial Intelligence (AI) models have been favoured because they can handle the nonlinear components in crime rates and are robust to small samples. Hence, the proposed crime model used an AI model, namely Gradient Tree Boosting (GTB), to model the crime rates. The crime rates were modelled using the United States (US) annual crime rates of eight crime types, together with nine factors that influence the crime rates. Since GTB has no built-in feature selection, this study proposed a hybridisation of Neighbourhood Component Analysis (NCA) and GTB (NCA-GTB) to identify the significant factors that influence the crime rates. It was also found that both NCA and GTB are sensitive to their input parameters. Thus, the DA2-NCA-eGTB model was proposed to improve the NCA-GTB model. The DA2-NCA-eGTB model hybridised a metaheuristic optimisation algorithm, namely the Dragonfly Algorithm (DA), with the NCA-GTB model to optimise the NCA and GTB parameters. In addition, the DA2-NCA-eGTB model improved the accuracy of the NCA-GTB model by using Least Absolute Deviation (LAD) as the GTB loss function. The experimental results showed that the DA2-NCA-eGTB model outperformed existing AI models on all eight modelled crime types. This was shown by its smaller Mean Absolute Percentage Error (MAPE) values, which ranged between 2.9195 and 18.7471.
In conclusion, the study showed that the DA2-NCA-eGTB model is statistically significant in representing all crime types and that it handles the nonlinear component in limited crime rate data well.
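
For readers unfamiliar with the two error measures named in the abstract, the following is a minimal sketch of their standard textbook definitions. It assumes MAPE is reported as a percentage and that LAD is the mean absolute residual; the function names and the sample numbers are illustrative only and are not taken from the study.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Absolute Percentage Error, expressed in percent.

    Assumes every actual value is non-zero (true for positive crime rates).
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def lad_loss(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Least Absolute Deviation: the mean of the absolute residuals."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical values, chosen only to make the arithmetic easy to follow.
observed = [100.0, 200.0]
forecast = [90.0, 220.0]
example_mape = mape(observed, forecast)     # each forecast is off by 10 percent
example_lad = lad_loss(observed, forecast)  # residuals are 10 and 20
```

Because LAD penalises residuals linearly rather than quadratically, a single extreme year influences the fit less than it would under squared error, which is one plausible reason for preferring it on short series.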
