5 research outputs found

    Towards Crime Prevention Using Big Data Analytics: A Literature Review with an Explorative Case Study

    Since the big data concept became popular, it has been applied in many areas. Contemporary literature has shown the potential of using big data in crime prevention. In this research paper, we examine research on the use of big data in crime prevention, moving from an author-centric to a concept-centric research approach. We also lay a foundation for future research by analyzing data from the City of Chicago. We identified the neighborhoods in Chicago that are statistically more prone to crime, using data from the last 10 years for our observations. We additionally provide discussion points for future research.

    The use of big data and data mining in the investigation of criminal offences

    The aim of this study was to determine the features and prospects of using Big Data and Data Mining in criminal proceedings. The research involved the methods of a systematic approach, descriptive analysis, systematic sampling, formal legal approach and forecasting. Big Data and Data Mining are applied to various crimes whose common features are their seriousness and the complexity of their investigation. The common tools of Big Data and Data Mining in crime investigation and crime forecasting, as interrelated tasks, were identified. The creation of databases is the result of processing data sources by Data Mining methods, each being distinguished by the specifics of its use. The main risks of implementing Big Data and Data Mining are violations of human rights and freedoms. Improving the use of Big Data and Data Mining requires standardization of procedures with strict adherence to the fundamental ethical, organizational and procedural rules. The use of Big Data and Data Mining is a forensic innovation in the investigation of serious crimes and the creation of an evidence base for criminal justice. The prospects for widespread use of these methods involve the standardization of procedures based on ethical, organizational and procedural principles. It is appropriate to outline these procedures in practical framework recommendations, emphasizing the responsibility of officials in case of violation of the specified principles. The area of further research is the improvement of innovative technologies and the legal regulation of their application.

    Predicting Violent Crime Reports from Geospatial and Temporal Attributes of US 911 Emergency Call Data

    The aim of this study is to create a model to predict which 911 calls will result in crime reports of a violent nature. Such a prediction model could be used by the police to prioritise the calls that are most likely to lead to violent crime reports. The model will use geospatial and temporal attributes of the call to predict whether a crime report will be generated. To create this model, a dataset of characteristics relating to the neighbourhood where the 911 call originated will be created and combined with characteristics related to the time of the 911 call. Geospatial and temporal analysis of past 911 calls and crime reports will be applied to determine which 911 calls resulted in crime reports (the dependent variable) so that supervised learning can be performed.
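
The labelling step described in this abstract, deciding which past 911 calls resulted in a crime report, could be sketched roughly as follows. This is only an illustration under stated assumptions, not the study's method: the `Event` record, the `label_calls` and `temporal_features` helpers, and the 200 m and 2 hour matching thresholds are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt


@dataclass
class Event:
    """Hypothetical record for either a 911 call or a crime report."""
    lat: float
    lon: float
    time: datetime


def haversine_m(a: Event, b: Event) -> float:
    """Great-circle distance between two events in metres."""
    r = 6_371_000.0  # mean Earth radius in metres
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = sin(dlat / 2) ** 2 + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2
    return 2 * r * asin(sqrt(h))


def label_calls(calls, reports, max_dist_m=200.0, max_delay=timedelta(hours=2)):
    """Return 1 for each call followed by a nearby report within max_delay, else 0.

    The thresholds are assumptions for illustration, not values from the study.
    """
    labels = []
    for call in calls:
        matched = any(
            timedelta(0) <= rep.time - call.time <= max_delay
            and haversine_m(call, rep) <= max_dist_m
            for rep in reports
        )
        labels.append(int(matched))
    return labels


def temporal_features(call: Event) -> dict:
    """Simple temporal attributes of a call: hour of day and a weekend flag."""
    return {"hour": call.time.hour, "is_weekend": int(call.time.weekday() >= 5)}
```

The resulting 0/1 labels would serve as the dependent variable, and the temporal attributes, together with neighbourhood characteristics, as inputs to a supervised learner.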

    Analyzing and predicting spatial crime distribution using crowdsourced and open data

    Data analytics has an ever increasing impact on tackling various societal challenges. In this article, we investigate how data from several heterogeneous online sources can be used to discover insights and make predictions about the spatial distribution of crime in large urban environments. A series of important research questions is addressed, following a purely data-driven approach and methodology. First, we examine how useful different types of data are for the task of crime levels prediction, focusing especially on how prediction accuracy can be improved by combining data from multiple information sources. To that end, we not only investigate prediction accuracy across all individual areas studied, but also examine how these predictions affect the accuracy of identified crime hotspots. Then, we look into individual features, aiming to identify and quantify the most important factors. Finally, we drill down to different crime types, elaborating on how the prediction accuracy and the importance of individual features vary across them. Our analysis involves six different datasets, from which more than 3,000 features are extracted, filtered, and used to learn models for predicting crime rates across 14 different crime categories. Our results indicate that combining data from multiple information sources can significantly improve prediction accuracy. They also highlight which features affect prediction accuracy the most, as well as for which particular crime categories the predictions are more accurate. © 2018 ACM

    Real Time Crime Prediction Using Social Media

    There is no doubt that crime is on the increase and has a detrimental influence on a nation's economy, despite several attempts by crime prediction studies to minimise crime rates. Data mining techniques for crime prediction models have often relied on historical information and are mostly country specific. In fact, only a few of the earlier studies on crime prediction follow a standard data mining procedure. Hence, considering the current worldwide crime trend in which criminals routinely publish their criminal intent on social media and invite others to watch and/or engage in different crimes, an alternative and more dynamic strategy is needed. The goal of this research is to improve the performance of crime prediction models. Thus, this thesis explores the potential of using information from social media (Twitter) for crime prediction in combination with historical crime data. It also identifies, using data mining techniques, the most relevant feature engineering needed for the United Kingdom dataset, which could improve crime prediction model performance. Additionally, this study presents a function that could be used by every state in the United Kingdom for data cleansing, pre-processing and feature engineering. A Shiny app was also used to display the tweets' sentiment trends to help prevent crime in near-real time. Exploratory analysis is essential for revealing the data pre-processing and feature engineering needed prior to feeding the data into the machine learning model for efficient results. Based on the earlier documented studies available, this is the first research to do a full exploratory analysis of historical British crime statistics using the stop and search historical dataset. Also, based on the findings from the exploratory study, an algorithm was created to clean the data and prepare it for further analysis and model creation. This is an enormous success because it provides a perfect dataset for future research, particularly for non-experts to utilise in constructing models to forecast crime or conducting investigations in around 32 police districts of the United Kingdom. Moreover, this study is the first to present a complete collection of geo-spatial parameters for training a crime prediction model by combining demographic data from the same source in the United Kingdom with hourly sentiment polarity that was not restricted to Twitter keyword search. Six unique base models that were frequently mentioned in the previous literature were selected, trained on the stop-and-search historical crime dataset, evaluated on test data, and finally validated with the London and Kent crime datasets. Two different datasets were created from Twitter and historical data (historical crime data with Twitter sentiment score and historical data without Twitter sentiment score). Six of the most prevalent machine learning classifiers (Random Forest, Decision Tree, K-nearest neighbour, support vector machine, neural network and naïve Bayes) were trained and tested on these datasets. Additionally, the hyperparameters of each of the six models were tuned using random grid search. Voting classifiers and a logistic regression stacked ensemble of different models were also trained and tested on the same datasets to enhance the individual model performance. In addition, two combinations of stacked ensembles of multiple models were constructed to enhance and choose the most suitable models for crime prediction, and based on their performance, the appropriate prediction model for the UK dataset would be selected. In terms of how the research may be interpreted, it differs from most earlier studies that employed Twitter data in that several methodologies were used to show how each attribute contributed to the construction of the model, and the findings were discussed and interpreted in the context of the study. Further, a Shiny app visualisation tool was designed to display the tweets' sentiment score, the text, the users' screen name, and the tweets' vicinity, which allows the investigation of any criminal actions in near-real time. The evaluation of the models revealed that Random Forest, Decision Tree, and K-nearest neighbour outperformed the other models. However, Decision Tree and Random Forest performed consistently better when evaluated on test data.
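
To make the voting idea mentioned in this abstract concrete, here is a minimal sketch of hard (majority) voting over the label predictions of several already-trained classifiers. It is an illustration only and not the thesis's implementation: the `majority_vote` helper and the example predictions are hypothetical, and the logistic regression stacked ensemble is not shown.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label list by majority vote.

    predictions_per_model holds one list of predicted labels per base model,
    all of the same length. On a tie, the label seen first in model order wins.
    """
    n_samples = len(predictions_per_model[0])
    if any(len(preds) != n_samples for preds in predictions_per_model):
        raise ValueError("all models must predict the same number of samples")

    combined = []
    for sample_votes in zip(*predictions_per_model):
        # Count the votes cast for this sample and keep the most frequent label.
        counts = Counter(sample_votes)
        combined.append(counts.most_common(1)[0][0])
    return combined


# Hypothetical predictions from three base models on four samples.
rf_preds = [1, 0, 1, 1]
dt_preds = [1, 1, 0, 1]
knn_preds = [0, 0, 1, 1]
ensemble_preds = majority_vote([rf_preds, dt_preds, knn_preds])
```

An odd number of base models avoids ties for binary labels. A stacked ensemble differs in that a second-level model, such as logistic regression, is trained on the base models' outputs instead of counting votes.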