6 research outputs found

    Big data analytics and mining for effective visualization and trends forecasting of crime data.

    Big data analytics (BDA) is a systematic approach for analyzing and identifying different patterns, relations, and trends within a large volume of data. In this paper, we apply BDA to criminal data, where exploratory data analysis is conducted for visualization and trends prediction. Several state-of-the-art data mining and deep learning techniques are used. Following statistical analysis and visualization, some interesting facts and patterns are discovered from criminal data in San Francisco, Chicago, and Philadelphia. The predictive results show that the Prophet model and Keras stateful LSTM perform better than neural network models, where the optimal size of the training data is found to be three years. These promising outcomes will help police departments and law enforcement organizations to better understand crime issues, and provide insights that will enable them to track activities, predict the likelihood of incidents, effectively deploy resources, and optimize the decision-making process.
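    The three-year training window mentioned in the abstract can be illustrated with a small sketch. The function name `split_by_training_window`, the `(day, count)` data layout, and the synthetic series are assumptions made for illustration; they are not taken from the paper, and no forecasting model is fitted here.

```python
from datetime import date, timedelta


def split_by_training_window(daily_counts, forecast_start, years=3):
    """Split (day, count) pairs into a training window and a test period.

    Hypothetical helper: training data is the `years` years immediately
    before `forecast_start`; test data is everything on or after it.
    """
    try:
        window_start = forecast_start.replace(year=forecast_start.year - years)
    except ValueError:
        # forecast_start is 29 February and the target year is not a leap year
        window_start = forecast_start.replace(
            year=forecast_start.year - years, day=28
        )
    train = [(d, c) for d, c in daily_counts if window_start <= d < forecast_start]
    test = [(d, c) for d, c in daily_counts if d >= forecast_start]
    return train, test


# Synthetic daily incident counts from 2012-01-01 through 2017-12-31
# (made-up numbers, used only to exercise the split).
first_day = date(2012, 1, 1)
n_days = (date(2018, 1, 1) - first_day).days
series = [(first_day + timedelta(days=i), 100 + i % 7) for i in range(n_days)]

train, test = split_by_training_window(series, date(2017, 1, 1))
```

    Here `train` covers 2014 through 2016 and `test` covers 2017; older records are simply left out of the fit.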

    Determination of Potential Criminals in Text Analysis: A Case Study (Determinación de Criminales Potenciales en Análisis de Textos: Caso de Estudio)

    This research aims to classify texts using Artificial Neural Networks (ANNs), specifically the Multilayer Perceptron (MLP), with basic word embedding techniques. The classification determines, by means of pattern recognition, whether or not a text has a criminal context. The MLP was trained with supervised learning on a small vocabulary and a small number of training records, each of which has a maximum length of 300 words for classification. Analyzing these kinds of texts could help government security forces, the military, and similar bodies to easily detect people who could harm the population, and to predict and prevent possible attacks. The software developed needs more word embedding techniques, a larger vocabulary, and more training records to be more effective. The dataset consists of two main classes, organized as criminal-type texts and regular texts.
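    The fixed maximum length of 300 words per record suggests a preprocessing step along the following lines. This is a minimal sketch, not the authors' code: the tokenizer, the `PAD`/`UNK` ids, and the helper names `build_vocab` and `encode` are all assumptions, and the embedding layer and MLP themselves are omitted.

```python
import re

MAX_LEN = 300      # maximum record length in words, as stated in the abstract
PAD, UNK = 0, 1    # assumed ids for padding and out-of-vocabulary words


def tokenize(text):
    """Lowercase the text and split it into word tokens (assumed tokenizer)."""
    return re.findall(r"\w+", text.lower())


def build_vocab(texts):
    """Assign an integer id (starting at 2) to each distinct training word."""
    vocab = {}
    for text in texts:
        for tok in tokenize(text):
            vocab.setdefault(tok, len(vocab) + 2)
    return vocab


def encode(text, vocab, max_len=MAX_LEN):
    """Map words to ids, truncate to max_len, and right-pad with PAD."""
    ids = [vocab.get(tok, UNK) for tok in tokenize(text)][:max_len]
    return ids + [PAD] * (max_len - len(ids))


# Toy training texts, one per class (illustrative only).
vocab = build_vocab([
    "they plan to attack the station",
    "the weather is nice today",
])
vec = encode("Plan the attack tomorrow", vocab)
```

    Each record becomes a list of exactly 300 integers, which an embedding layer followed by an MLP could consume; words outside the small vocabulary all collapse to `UNK`, which is one reason a larger vocabulary would help.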

    Spatio-temporal crime HotSpot detection and prediction: a systematic literature review

    The primary objective of this study is to accumulate, summarize, and evaluate the state of the art in spatio-temporal crime hotspot detection and prediction techniques by conducting a systematic literature review (SLR). The authors were unable to find a comprehensive study on crime hotspot detection and prediction while conducting this SLR. Therefore, to the best of the authors' knowledge, this study is the first attempt to critically analyze the existing literature and to present the potential challenges faced by current crime hotspot detection and prediction systems. The SLR was conducted by thoroughly consulting five leading scientific databases (IEEE, Science Direct, Springer, Scopus, and ACM), and it synthesizes 49 studies on crime hotspot detection and prediction after critical review. This study unfolds the following major aspects: 1) the impact of data mining and machine learning approaches, especially clustering techniques, in crime hotspot detection; 2) the utility of time series analysis techniques and deep learning techniques in crime trend prediction; 3) the inclusion of spatial and temporal information in crime datasets, which makes crime prediction systems more accurate and reliable; 4) the potential challenges faced by the state-of-the-art techniques and the future research directions. Moreover, the SLR aims to provide a core foundation for research on spatio-temporal crime prediction applications while highlighting several challenges related to the accuracy of crime hotspot detection and prediction applications.

    Real Time Crime Prediction Using Social Media

    There is no doubt that crime is on the increase and has a detrimental influence on a nation's economy, despite numerous studies on crime prediction that aim to minimise crime rates. Data mining techniques for crime prediction models have often relied on historical information and are mostly country-specific. In fact, only a few of the earlier studies on crime prediction follow a standard data mining procedure. Hence, considering the current worldwide crime trend in which criminals routinely publish their criminal intent on social media and ask others to see and/or engage in different crimes, an alternative, more dynamic strategy is needed. The goal of this research is to improve the performance of crime prediction models. Thus, this thesis explores the potential of using information on social media (Twitter) for crime prediction in combination with historical crime data. It also identifies, using data mining techniques, the feature engineering most relevant to the United Kingdom dataset that could improve crime prediction model performance. Additionally, this study presents a function that could be used by every state in the United Kingdom for data cleansing, pre-processing, and feature engineering. A Shiny app was also used to display tweet sentiment trends to help prevent crime in near-real time. Exploratory analysis is essential for revealing the data pre-processing and feature engineering needed before feeding the data into the machine learning model for effective results. Based on the available documented studies, this is the first research to carry out a full exploratory analysis of historical British crime statistics using the stop-and-search historical dataset. Also, based on the findings from the exploratory study, an algorithm was created to clean the data and prepare it for further analysis and model creation.
    This is a notable success because it provides a clean dataset for future research, which non-experts in particular can use to construct crime forecasting models or conduct investigations in around 32 police districts of the United Kingdom. Moreover, this is the first study to present a complete collection of geo-spatial parameters for training a crime prediction model by combining demographic data from the same source in the United Kingdom with hourly sentiment polarity that was not restricted to a Twitter keyword search. Six distinct base models frequently mentioned in the previous literature were selected, trained on the stop-and-search historical crime dataset, evaluated on test data, and finally validated on the London and Kent crime datasets. Two different datasets were created from Twitter and historical data (historical crime data with Twitter sentiment score and historical data without Twitter sentiment score). Six of the most prevalent machine learning classifiers (Random Forest, Decision Tree, K-nearest neighbour, support vector machine, neural network, and naïve Bayes) were trained and tested on these datasets. Additionally, the hyperparameters of each of the six models were tuned using random grid search. Voting classifiers and a logistic regression stacked ensemble of different models were also trained and tested on the same datasets to improve on individual model performance. In addition, two combinations of stacked ensembles of multiple models were constructed to identify the most suitable models for crime prediction; the appropriate prediction model for the UK dataset was then selected based on their performance.
    In terms of interpretability, this research differs from most earlier studies that employed Twitter data in that several methodologies were used to show how each attribute contributed to the construction of the model, and the findings were discussed and interpreted in the context of the study. Further, a Shiny app visualisation tool was designed to display each tweet's sentiment score, text, user screen name, and vicinity, which allows the investigation of any criminal actions in near-real time. The evaluation of the models revealed that Random Forest, Decision Tree, and K-nearest neighbour outperformed the other models. However, Decision Tree and Random Forest performed better consistently when evaluated on test data.
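    The voting classifiers mentioned in this abstract combine the predictions of several base models. The sketch below shows plain hard (majority) voting only; it is an assumption-laden illustration rather than the thesis's implementation. The function name `majority_vote`, the tie-break rule, the label names, and the three prediction lists are all hypothetical, and the stacked logistic regression ensemble is not covered.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per sample (hard voting).

    predictions_per_model: list of equal-length label lists, one per model.
    Assumed tie-break: the first label, in model order, with the top count.
    """
    voted = []
    for labels in zip(*predictions_per_model):
        counts = Counter(labels)
        best = max(counts.values())
        voted.append(next(lab for lab in labels if counts[lab] == best))
    return voted


# Hypothetical predictions from three base models on four samples.
rf_pred = ["crime", "no_crime", "crime", "crime"]
dt_pred = ["crime", "crime", "no_crime", "crime"]
knn_pred = ["no_crime", "no_crime", "no_crime", "crime"]

ensemble_pred = majority_vote([rf_pred, dt_pred, knn_pred])
```

    With an odd number of binary classifiers no ties occur; the tie-break only matters for an even number of models or more than two classes.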