
    High-Resolution Road Vehicle Collision Prediction for the City of Montreal

    Road accidents are an important issue in modern societies, responsible for millions of deaths and injuries worldwide every year. In Quebec alone, road accidents were responsible for 359 deaths and 33 thousand injuries in 2018. In this paper, we show how one can leverage open datasets of a city like Montreal, Canada, to create high-resolution accident prediction models using big data analytics. Compared to other studies in road accident prediction, we have a much higher prediction resolution, i.e., our models predict the occurrence of an accident within an hour, on road segments defined by intersections. Such models could be used for road accident prevention, but also to identify key factors that can lead to a road accident and, consequently, to help elaborate new policies. We tested various machine learning methods to deal with the severe class imbalance inherent to accident prediction problems. In particular, we implemented the Balanced Random Forest algorithm, a variant of the Random Forest machine learning algorithm, in Apache Spark. Interestingly, we found that in our case, Balanced Random Forest does not perform significantly better than Random Forest. Experimental results show that 85% of road vehicle collisions are detected by our model with a false positive rate of 13%. The examples identified as positive are likely to correspond to high-risk situations. In addition, we identify the most important predictors of vehicle collisions for the area of Montreal: the count of accidents on the same road segment during previous years, the temperature, the day of the year, the hour and the visibility.
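The abstract above names Balanced Random Forest but does not show how it handles class imbalance. As a rough illustration only, the sketch below shows the balanced bootstrap step usually associated with that algorithm: each tree is trained on a sample with equal draws from the minority and majority classes. The function name `balanced_bootstrap` and the toy data are assumptions for illustration, not the paper's Apache Spark implementation.

```python
import random


def balanced_bootstrap(labels, rng):
    """Return row indices for one tree, with equal draws from each class.

    Hypothetical helper: illustrates the sampling idea only, not the
    paper's Spark code.
    """
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]
    # Size of the smaller (minority) class.
    n = min(len(positives), len(negatives))
    # Draw n indices with replacement from each class.
    sample = rng.choices(positives, k=n) + rng.choices(negatives, k=n)
    rng.shuffle(sample)
    return sample


# Toy data (made-up numbers): 5 collisions among 1,000 segment-hours.
labels = [1] * 5 + [0] * 995
sample = balanced_bootstrap(labels, random.Random(0))
positives_in_sample = sum(labels[i] for i in sample)
print(len(sample), positives_in_sample)
```

Each tree then sees a 50/50 class mix even though the full data set is heavily skewed; a full forest would repeat this draw once per tree.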

    An Analysis of the Predictive Capability of C5.0 and Chaid Decision Trees and Bayes Net in the Classification of fatal Traffic Accidents in the UK

    Road traffic accidents are a significant cause of deaths worldwide and there is a global focus on understanding accident contributory factors and implementing prevention strategies. Although accident statistics are steadily improving, effective prevention must be persistent, evidence based and properly resourced. This research aimed to derive fatal traffic accident predictions from UK STATS19 accident data using C5.0 and Chaid decision trees and Bayes net classification models. Data was grouped as either fatal or non-fatal. The class imbalance due to fatal accident infrequency was considered, and data transformation and sampling techniques were applied to increase prediction likelihood. Chaid was used for supervised discretisation and proved effective in identifying homogeneous subgroups. SPSS Modeler was used for data preparation and model building. Model performance was evaluated using accuracy, recall, precision and ROC curves. The experiment design and data preparation approach successfully predicted fatal accidents with high recall; however, significant misclassification of non-fatals as fatals led to poor accuracy and precision. Boosting was subsequently tested and achieved some accuracy improvement. Serious accidents were grouped as non-fatal in the initial data analysis but are likely to share characteristics with fatal accidents, so the models struggled to classify them correctly as non-fatal. Changing the experiment design to select fatal, serious and slight as targets may improve the models' accuracy. Overall, the models succeeded in classifying fatal traffic accidents correctly, which was the original objective of the research. Interpretation of business rules, by ranking rules and summarising them in a standard format, proved effective for understanding and comparing key predictors. When comparing the C5.0 and Bayes net models, the contributory factors identified were consistent, with road surface and urban/rural identified as the strongest predictors for both models. The experiment demonstrated that classification techniques can be used to predict infrequent events once sampling techniques are applied.
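To make the recall versus precision trade-off described in this abstract concrete, here is a minimal sketch that computes the named metrics from a confusion matrix. The counts are hypothetical, chosen only to show how high recall can coexist with poor precision when the positive class is rare; they are not results from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "recall": tp / (tp + fn),          # share of true fatals detected
        "precision": tp / (tp + fp),       # share of predicted fatals that are fatal
        "false_positive_rate": fp / (fp + tn),
    }


# Hypothetical counts: 100 fatal and 9,900 non-fatal accidents.
metrics = classification_metrics(tp=90, fp=2000, fn=10, tn=7900)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts most fatal cases are caught, yet the many non-fatal cases flagged as fatal pull precision and accuracy down, which is the pattern the abstract reports.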

    Road Traffic Accidents Severity Modelling in Malawi

    One of the significant problems Malawi faces today is the rate at which road traffic accidents and deaths occur on its roads. It is crucial to address this problem effectively within a limited budget, given that Malawi is a developing country. To supplement current safety measures, data mining of traffic accident data using machine learning models was considered. Being able to predict the severity of an accident, and to determine the weight each attribute contributes to that severity, could help authorities make informed decisions. This research therefore aimed to model the severity of road accidents in Malawi to help reduce traffic accidents, or their severity, with limited budgetary resources. Using Python, three classification algorithms were employed to model accident severity: decision trees, logistic regression and support vector machines. The models were evaluated using accuracy, precision, recall and F1-score. Logistic regression performed better than the other two, and after fitting the model it was found that the top three attributes contributing to fatal accidents were accidents involving a moving vehicle and a pedestrian, accidents that occurred at dawn or dusk, and accidents involving a moving vehicle and a bicycle.

    A new framework for using data mining and association rules to classify traffic accident severity

    Introduction: Traffic accidents are an undesirable burden on society. Every year around one million deaths and more than ten million injuries are reported due to traffic accidents. Hence, traffic accident prevention measures must be taken to reduce the accident rate. Different countries have different geographical and environmental conditions, and hence the accident factors differ in each country. Traffic accident data analysis is very useful in revealing the factors that affect accidents in different countries. This article was written in 2016 at the Institute of Technology & Science, Mohan Nagar, Ghaziabad, UP, India. Methodology: We propose a framework to utilize association rule mining (ARM) for the severity classification of traffic accident data obtained from police records in Muzaffarnagar district, Uttar Pradesh, India. Results: The results reveal some hidden factors that can be used to understand the causes of road accidents in this region. Conclusions: The framework enables us to find three clusters in the data set. Each cluster represents a type of accident severity, i.e. fatal, major injury and minor/no injury. The association rules exposed different factors that are associated with road accidents in each category. The information extracted can be employed to adapt preventive measures to reduce accident severity in Muzaffarnagar district.

    The role of human factor in incidence and severity of road crashes based on the CART and LR regression: a data mining approach

    Accidents are one of the biggest public health problems in the world. As the literature indicates, traffic accidents have been assessed to be the most significant health problem in Iran. To date, no serious research has analyzed high-dimensional traffic data in Iran. This paper therefore aims to bridge that gap. In this study, the traffic data are analyzed with data mining techniques, namely Logistic Regression and Classification and Regression Trees. The impact of human factors on the incidence and severity of road crashes was investigated using these techniques. It is hoped that the findings will help governments improve road design and traffic management.

    Roadway Traffic Analysis using Data Mining Techniques for Providing Safety Measures to Avoid Fatal Accidents

    Roadway traffic safety is a major concern for transportation governing agencies as well as ordinary citizens. Data mining is the extraction of hidden patterns from large databases. It is commonly used in marketing, surveillance, fraud detection and scientific discovery. Within data mining, research focuses mainly on machine learning, which automatically learns to recognize complex patterns and make intelligent decisions based on data. Globalization has affected many countries. There has been a drastic increase in economic activity and consumption levels, leading to an expansion of travel and transportation. The increase in vehicles and traffic leads to road accidents. Given the importance of road safety, the government is trying to identify the causes of road accidents in order to reduce their number. The exponential growth of accident data makes it difficult to analyse the factors causing road accidents. The paper describes how to mine frequent patterns causing road accidents from a collected data set. We find associations among road accidents and predict the type of accidents for existing as well as new roads. We use association and classification rules to discover patterns among road accidents and to predict road accidents for new roads.

    Identifying Road Accidents Severity Problems Using Data Mining Approaches

    Roadway traffic safety is a major concern for transportation governing agencies as well as ordinary citizens. In order to give safe driving suggestions, careful analysis of roadway traffic data is critical to find variables that are closely related to fatal accidents. In this paper we apply statistical analysis and data mining algorithms to the FARS Fatal Accident dataset in an attempt to address this problem. The relationship between the fatality rate and other attributes, including collision manner, weather, surface condition, light condition and drunk driving, was investigated. Association rules were discovered with the Apriori algorithm, a classification model was built with a Naive Bayes classifier, and clusters were formed with the simple K-means clustering algorithm. We also use one additional classification technique for comparison with the Naive Bayes classifier. Safe driving suggestions were made based on the statistics, association rules, classification model and clusters obtained.
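Association rules of the kind mentioned above are typically ranked by support and confidence, the two measures that Apriori thresholds on. The sketch below computes both for one candidate rule over a handful of made-up accident records; the attribute names and records are hypothetical and are not taken from the FARS dataset.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) from the transactions."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical accident records, each a set of attribute values.
records = [
    {"wet_surface", "dark", "fatal"},
    {"wet_surface", "dark", "fatal"},
    {"wet_surface", "daylight", "non_fatal"},
    {"dry_surface", "dark", "non_fatal"},
    {"dry_surface", "daylight", "non_fatal"},
]

# Candidate rule: {wet_surface, dark} -> {fatal}
rule_support = support(records, {"wet_surface", "dark"})
rule_confidence = confidence(records, {"wet_surface", "dark"}, {"fatal"})
print(rule_support, rule_confidence)
```

A full Apriori run would enumerate frequent itemsets level by level and keep only the rules that clear minimum support and confidence thresholds; this sketch scores a single rule.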

    Data Mining as a Method for Comparison of Traffic Accidents in Şişli District of Istanbul

    Studies to reduce traffic accidents are of great importance, especially for metropolitan cities. One of these metropolitan cities is undoubtedly Istanbul. In this study, we sought a perspective on reducing traffic accidents by analyzing 3833 fatal and injury traffic accidents that occurred in the Şişli district of Istanbul between 2010 and 2017, using Data Mining (DM), Machine Learning (ML) and Geographic Information Systems (GIS) methods as well as traditional methods. The aims were to determine visually the streets where traffic accidents are concentrated, to examine whether accidents show anomalies by day of the week, to examine differences between accidents occurring in different regions, and to develop a model. For this purpose, Kernel Density, decision trees, artificial neural networks, logistic regression and Naive Bayes methods were used. The results show that some days differ from others in terms of accident intensity, and that the performance of the modelling techniques varies by region. This study revealed that the ‘day of the week effect’ can also be applied to traffic accidents