Pedestrian crash severity prediction and contributory factors analysis by using machine learning methods

Abstract

Pedestrians are among the most vulnerable road users: each year about 270,000 pedestrians die in road accidents. This study aims to identify the most influential contributory factors and the most promising models for predicting pedestrian crash severity. ISTAT data for the City of Rome (2013–2020) are used, and different machine learning methods are trained and tested after balancing the data with oversampling techniques. In addition, the most influential contributory factors are analysed using the ROC curve method, Variable Importance Analysis (VIP), and a Support Vector Machine with a linear kernel. The findings suggest that the model with the best prediction performance is the Random Forest, followed by the Decision Tree and the k-nearest neighbour algorithm. Regarding contributory factors, the methods implemented indicate that the hour at which the accident occurs, pedestrian gender, and age appear to be the most critical factors that increase the severity of a pedestrian crash. The study also has some limitations: the first is connected to the black-box nature of these models; the second concerns how these variables could influence the outcome positively or negatively.
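The balancing step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which oversampling technique was used, so the example below assumes plain random oversampling (duplicating minority-class records until the classes are the same size). The function name `random_oversample`, the toy records, and the class labels are all hypothetical and are not taken from the ISTAT data.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many rows as the largest class."""
    rng = random.Random(seed)

    # Group the rows by their class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    # Every class is grown to the size of the largest one.
    target = max(len(group) for group in by_class.values())

    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Sample with replacement to fill the gap (empty for the largest class).
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical toy records: (hour of day, pedestrian age). Labels are invented.
X_train = [(8, 34), (13, 51), (18, 27), (22, 70), (2, 81), (10, 45)]
y_train = ["non-severe", "non-severe", "non-severe",
           "non-severe", "severe", "non-severe"]

X_bal, y_bal = random_oversample(X_train, y_train)
balanced_counts = Counter(y_bal)
print(balanced_counts)
```

In this sketch only the training records are balanced. Oversampling before a train/test split would let duplicated records appear on both sides of the split, so keeping the test set untouched is a common precaution; whether the study followed this order is not stated in the abstract.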

Last time updated on 05/01/2026

This paper was published in ART.
