95 research outputs found

    Deep Learning, Machine Learning, or Statistical Models for Weather-related Crash Severity Prediction

    Nearly 5,000 people are killed and more than 418,000 are injured in weather-related traffic incidents each year. Assessments of the effectiveness of statistical models applied to crash severity prediction compared to machine learning (ML) and deep learning (DL) techniques help researchers and practitioners know which models are most effective under specific conditions. Given the class imbalance in crash data, the synthetic minority over-sampling technique for nominal data (SMOTE-N) was employed to generate synthetic samples for the minority class. The ordered logit model (OLM) and the ordered probit model (OPM) were evaluated as statistical models, while random forest (RF) and XGBoost were evaluated as ML models. For DL, multi-layer perceptron (MLP) and TabNet were evaluated. The performance of these models varied across severity levels, with property damage only (PDO) predictions performing the best and severe injury predictions performing the worst. The TabNet model performed best in predicting severe injury and PDO crashes, while RF was the most effective in predicting moderate injury crashes. However, all models struggled with severe injury classification, indicating the potential need for model refinement and exploration of other techniques. Hence, the choice of model depends on the specific application and the relative costs of false negatives and false positives. This conclusion underscores the need for further research in this area to improve the prediction accuracy of severe and moderate injury incidents, ultimately improving available data that can be used to increase road safety.
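
As a rough illustration of how an ordered logit model turns a linear predictor into probabilities for ordered severity levels, here is a minimal sketch. The cutpoints, the linear predictor value, and the three-level coding (PDO, moderate, severe) are hypothetical and are not taken from the study.

```python
import math


def ordered_logit_probs(linear_predictor, cutpoints):
    """Return P(y = k) for each ordered class under an ordered logit model.

    Uses the common form P(y <= k) = logistic(cutpoint_k - linear_predictor),
    with cutpoints given in increasing order.
    """
    def logistic(z):
        return 1.0 / (1.0 + math.exp(-z))

    # Cumulative probabilities at each cutpoint, then 1.0 for the top class.
    cumulative = [logistic(c - linear_predictor) for c in cutpoints] + [1.0]

    # Class probabilities are differences of consecutive cumulative values.
    probs = []
    previous = 0.0
    for cum in cumulative:
        probs.append(cum - previous)
        previous = cum
    return probs


# Hypothetical example: two cutpoints give three ordered severity levels.
probs = ordered_logit_probs(linear_predictor=0.5, cutpoints=[0.0, 2.0])
for level, p in zip(["PDO", "moderate", "severe"], probs):
    print(f"{level}: {p:.3f}")
```

A larger linear predictor shifts probability mass toward the most severe class, which is the sense in which the model is "ordered".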

    Parametric and Non-Parametric Analyses for Pedestrian Crash Severity Prediction in Great Britain

    The study aims to investigate the factors that are associated with fatal and severe vehicle–pedestrian crashes in Great Britain by developing four parametric models and five non-parametric tools to predict crash severity. Even though these models have already been applied to pedestrian injury severity, comparative analyses of the predictive power of such modeling techniques are limited. Hence, this study contributes to the road safety literature by comparing the models by their capabilities of identifying the significant explanatory variables, and by their performances in terms of the F-measure, the G-mean, and the area under the curve. The analyses were carried out using data on the vehicle–pedestrian crashes that occurred in the period 2016–2018. The parametric models confirm their advantages in offering easy-to-interpret outputs and understandable relations between the dependent and independent variables, whereas the non-parametric tools exhibited higher classification accuracies, identified more explanatory variables, and provided insights into the interdependencies among the factors. The study results suggest that the combined use of parametric and non-parametric methods may effectively overcome the limits of each group of methods, with satisfactory prediction accuracies and the interpretation of the factors contributing to fatal and serious crashes. In the conclusion, several engineering, social, and management pedestrian safety countermeasures are recommended.
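
The F-measure and G-mean mentioned above can be computed directly from a binary confusion matrix. The sketch below uses the standard definitions; the counts are hypothetical and only illustrate why these metrics are preferred over plain accuracy when severe crashes are rare.

```python
import math


def f_measure(tp, fp, fn, beta=1.0):
    """F-measure: weighted harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


def g_mean(tp, fp, tn, fn):
    """G-mean: geometric mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Hypothetical counts for an imbalanced problem (50 severe cases out of 1,000).
tp, fp, tn, fn = 30, 20, 930, 20
accuracy = (tp + tn) / (tp + fp + tn + fn)
print(f"accuracy={accuracy:.3f}")
print(f"F1={f_measure(tp, fp, fn):.3f}")
print(f"G-mean={g_mean(tp, fp, tn, fn):.3f}")
```

With these counts the accuracy looks high because the majority class dominates, while the F-measure and G-mean expose the weaker performance on the rare class.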

    Evaluation of Parametric and Nonparametric Statistical Models in Wrong-way Driving Crash Severity Prediction

    Wrong-way driving (WWD) crashes result in more fatalities per crash, involve more vehicles, and cause extended road closures compared to other types of crashes. Although crashes involving wrong-way drivers are relatively few, they often lead to fatalities and serious injuries. Researchers have been using parametric statistical models to identify factors that affect WWD crash severity. However, these parametric models are generally based on several assumptions, and the results could generate numerous errors and become questionable when these assumptions are violated. On the other hand, nonparametric methods such as data mining or machine learning techniques do not use a predetermined functional form, can address the correlation problem among independent variables, display results graphically, and simplify the potential complex relationship between the variables. The main objective of this research was to demonstrate the applicability of nonparametric statistical models in successfully identifying factors affecting traffic crash severity. To achieve this goal, the performance of parametric and nonparametric statistical models in WWD crash severity prediction was evaluated. The following parametric methods were evaluated: Logistic Regression (LR), Ridge Regression (RR), Least Absolute Shrinkage and Selection Operator (LASSO), Linear Discriminant Analysis (LDA), and Gaussian Naïve Bayes (GNB). The following nonparametric methods were evaluated: Random Forests (RF), Decision Trees (DT), and Support Vector Machine (SVM). The evaluation was based on sensitivity, specificity, and prediction accuracy. The research also demonstrated the applicability of nonparametric supervised learning algorithms on crash severity analysis by combining tree-based data mining techniques and marginal effect analysis to show the correlation between the response and the predictor variables. 
The analysis was based on 1,475 WWD crashes that occurred on arterial road networks from 2012 to 2016 in Florida. The results showed that nonparametric models predicted serious injury more accurately than parametric models. Through prediction accuracy comparison, marginal effect analysis of contributing variables, variable importance evaluation, and crash severity pattern recognition analysis, the nonparametric models were shown to be valid and to serve as an alternative tool in transportation safety studies. The results showed that head-on collisions, weekends, high-speed facilities, crashes involving vehicles entering from a driveway, dark-not lighted roadways, older drivers, and driver impairment are important factors that play a crucial role in WWD crash severity on non-limited access facilities. This information may assist researchers and safety engineers in identifying specific strategies to reduce the severity of WWD crashes on arterial streets. Besides unveiling the factors contributing to WWD crash severity and their relationship with each other, this research has demonstrated the potential of using data mining techniques in yielding results that are easily understandable and interpretable.

    Severity Analysis of Large Truck Crashes- Comparision Between the Regression Modeling Methods with Machine Learning Methods.

    According to the Texas Department of Transportation’s Texas Motor Vehicle Crash Statistics, Texas has had the highest number of severe crashes involving large trucks in the US. As defined by the US Department of Transportation, a large truck is any vehicle with a gross vehicle weight rating greater than 10,000 pounds. Generally, large trucks require more time and much more space to accelerate, slow down, and stop. They also have large blind spots when making wide turns. Therefore, if an unexpected traffic situation arises, it is more difficult for large trucks than for regular vehicles to take evasive action to avoid a collision. Due to their large size and heavy weight, large truck crashes often result in huge economic and social costs. Predicting the severity level of a reported large truck crash with unknown severity, or of crashes that may be expected to occur in the future, is useful. It can help to prevent the crash from happening or help rescue teams and hospitals provide proper medical care as fast as possible. To identify the appropriate modeling approaches for predicting the severity of large truck crashes, this research selected four representative classification tree-based ML models (Extreme Gradient Boosting tree (XGBoost), Adaptive Boosting tree (AdaBoost), Random Forest (RF), and Gradient Boost Decision Tree (GBDT)), two non-tree-based ML models (Support Vector Machines (SVM) and k-Nearest Neighbors (kNN)), and an LR model. The results indicate that the GBDT model performs best among all seven models.

    Identifying and quantifying factors affecting traffic crash severity in Louisiana

    This study was conducted to identify and quantify the factors affecting highway crash severity in Louisiana. Three candidate models were fit to the crash data to compare their performance and the Ordered Mixed Logit (OML) model was selected as the crash severity prediction model of choice. The factors contributing to crash severity identified by the OML model are: age and gender of the driver, vehicle speed, whether alcohol played a role in the crash, whether seatbelts were used, whether the driver was ejected from the vehicle, whether the crash was a head-on collision, whether an airbag was deployed, and whether one of the vehicles was following too close behind another vehicle. Among the nine contributing factors, alcohol involvement, seatbelt use, and speed are most readily altered by a safety policy or countermeasure. Thus, a detailed analysis was conducted to analyze the impact of these factors on crash severity since they lend themselves to alteration. The following conclusions were presented by the study: for every ten percent drop in alcohol-related crashes, 4.5 % fewer fatalities and 8.7 % fewer serious injuries were predicted to occur. Proportionally, the reduction in fatalities was 5 times higher among young male drivers than the rest of the population; a 10 percent increase of seatbelt use can lead to an 8.4 percent reduction of fatal crashes and more than 6 percent decline of severe injury crashes. Targeting young male drivers and uninsured drivers would be conceivably more efficient in terms of effort per driver than applying countermeasures to all drivers; reducing the maximum speed can greatly reduce fatal crashes whereas reducing average speed can reduce the fatal and all injury crashes. The characteristics of the repeat DUI offenders and the repeat crash takers were also analyzed in the study. 
Based on the analysis results, safety policies and countermeasures such as a point system were identified to remedy the existing safety problems and reduce the overall crash severity. Finally, the study addresses how to estimate the benefit of a safety policy.

    Integrated Accident Resilience Framework (IARF) – A Theoretical Approach Using Spatial and Statistical Analysis

    Throughout the world, road accidents have become a nightmare for local governments. Data show that someone dies on the road every 24 seconds (WHO, 2018). Generally, multiple factors cause road accidents, such as traffic volume and composition, speed, infrastructure conditions, climatic conditions, and vehicle factors. This paper proposes an Integrated Accident Resilience Framework (IARF). The framework takes the form of a theoretical method that may help transportation agencies and governments develop a practical system for crash analysis and mitigation. The IARF presented in this paper consists of stages such as data collection, storage, and analysis, which help to compute correlations between crash causation parameters and crash frequency. The tools used to perform the analysis functions in the framework consist of a GIS platform and the negative binomial regression model. The computed results help identify the major influencing parameters that are linked to traffic accidents and their contribution to crash frequency in black spot locations. This can be used to mitigate future crashes by taking appropriate remedial measures in collision-prone regions. The methodology presented can also be scaled up to a city-level network. The entire transportation network can be spatially marked to develop a resilient accident management strategy, potentially even in real time.
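
Negative binomial regression is typically chosen for crash frequency because crash counts tend to be overdispersed (variance larger than the mean). The sketch below evaluates the negative binomial probability mass function in the common NB2 parameterization, where the variance is mu + alpha * mu^2. The parameterization and the values of `mu` and `alpha` are assumptions for illustration; the abstract does not specify them.

```python
import math


def neg_binomial_pmf(y, mu, alpha):
    """P(Y = y) for a negative binomial with mean mu and dispersion alpha (NB2).

    Variance is mu + alpha * mu**2; as alpha approaches 0 the distribution
    approaches a Poisson with mean mu.
    """
    r = 1.0 / alpha  # inverse-dispersion ("size") parameter
    log_p = (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu))
        + y * math.log(mu / (r + mu))
    )
    return math.exp(log_p)


# Hypothetical site with an expected 3 crashes per year and dispersion 0.5.
mu, alpha = 3.0, 0.5
pmf = [neg_binomial_pmf(y, mu, alpha) for y in range(500)]
mean = sum(y * p for y, p in enumerate(pmf))
variance = sum((y - mean) ** 2 * p for y, p in enumerate(pmf))
print(f"mean={mean:.3f}, variance={variance:.3f}")
```

In a regression setting, `mu` would be modeled per site as the exponential of a linear combination of covariates such as traffic volume or speed.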

    Short-term crash risk prediction considering proactive, reactive, and driver behavior factors

    Providing a safe and efficient transportation system is the primary goal of transportation engineering and planning. Highway crashes are among the most significant challenges to achieving this goal. They result in a significant societal toll reflected in numerous fatalities, personal injuries, property damage, and traffic congestion. To that end, much attention has been given to predictive models of crash occurrence and severity. Most of these models are reactive: they use data about crashes that have occurred in the past to identify the significant crash factors, crash hot-spots, and crash-prone roadway locations, and to analyze and select the most effective countermeasures for reducing the number and severity of crashes. More recently, advancements have been made in developing proactive crash risk models to assess short-term crash risks in near-real time. Such models could be applied as part of traffic management strategies to prevent and mitigate crashes. Driver behavior is found to be the leading cause of highway crashes. Nevertheless, due to data unavailability, few studies have explored and quantified the role of driver behavior in crashes. The Strategic Highway Research Program Naturalistic Driving Study (SHRP 2 NDS) offers an unprecedented opportunity to perform an in-depth analysis of the impacts of driver behavior on crash events. The research presented in this dissertation is divided into three parts, corresponding to the research objectives. The first part investigates the application of advanced data modeling methods for proactive crash risk analysis. Several proactive models for segment-level crash risk and severity assessment are developed and tested, considering the proactive data available to most transportation agencies in real time at a regional network scale. The data include roadway geometry characteristics, traffic flow characteristics, and weather condition data. 
The analysis methods include Random-effect Bayesian Logistic Regression, Random Forest, Gradient Boosting Machine, K-Nearest Neighbor, Gaussian Naive Bayes (GNB), and Multi-layer Feedforward Deep Neural Network (MLFDNN). The random oversampling technique is applied to deal with the problem of data imbalance associated with the injury severity analysis. The model training and testing are completed using a dataset containing records of 10,155 crashes that occurred on two interstate highways in New Jersey over a period of two years. The second part of the study analyzes the potential improvement in the prediction abilities of the proposed models by adding reactive data (such as vehicle characteristics and driver characteristics) to the analysis. Commonly, the reactive data is only available (known) after the crash occurs. In the proposed research, the crash analysis is performed by classifying crashes in multiple groupings (instead of a single group), constructed based on the age of drivers and vehicles to account for the impact of reactive data on driver injury severity outcomes. The results of the second part of the study show that while the simultaneous use of reactive and proactive data can improve the prediction performance of the models, the absolute crash probability values must be further improved for operational crash risk prediction. To this end, in the third part of the study, the Naturalistic Driving Study data is used to calibrate the crash risk models, including the driver behavior risk factors. The findings show significant improvement in crash prediction accuracy with the inclusion of driver behavior risk factors, which confirms driver behavior to be the most critical risk factor affecting the crash likelihood and the associated injury severity.
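
Random oversampling, as mentioned above, balances classes by duplicating minority-class records at random. The following is a minimal sketch of the general technique, not the dissertation's implementation; the function name, the toy records, and the severity labels are hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until all classes
    have as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())

    out_rows, out_labels = list(rows), list(labels)
    for label, count in counts.items():
        # Pool of original rows belonging to this class.
        pool = [r for r, l in zip(rows, labels) if l == label]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical toy data: 6 property-damage-only records and 2 injury records.
rows = [{"id": i} for i in range(8)]
labels = ["PDO"] * 6 + ["injury"] * 2
balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

Oversampling is normally applied to the training split only, so that duplicated records do not leak into the test set and inflate the reported accuracy.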

    The Model of Severity Prediction of Traffic Crash on the Curve

    Focusing on traffic crashes on curved road segments, this study established a logistic-regression-based crash severity prediction model from a sample crash database of 20,000 entries collected from 4 regions of China, using 15 evaluation indicators covering driver, driving environment, and traffic environment factors. Maximum likelihood estimation and a step-back technique were used for the analysis, which concluded that the three main factors contributing to crash severity on curves are weather, roadside protection facilities, and pavement structure. Hosmer and Lemeshow tests were used to verify the reliability of the model, and the model variables were also discussed.
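
The Hosmer and Lemeshow test checks calibration by grouping observations by predicted risk and comparing observed and expected event counts in each group. The sketch below computes the test statistic under the usual equal-size grouping; the predicted probabilities and outcomes are hypothetical, and the grouping details in the paper may differ.

```python
def hosmer_lemeshow_statistic(y_true, y_prob, groups=10):
    """Hosmer-Lemeshow chi-square statistic for a binary outcome.

    Observations are sorted by predicted probability and split into
    `groups` bins of nearly equal size. The statistic is usually compared
    against a chi-square distribution with groups - 2 degrees of freedom.
    """
    pairs = sorted(zip(y_prob, y_true), key=lambda pair: pair[0])
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        n_g = len(chunk)
        expected = sum(p for p, _ in chunk)   # expected number of events
        observed = sum(y for _, y in chunk)   # observed number of events
        mean_p = expected / n_g
        denominator = n_g * mean_p * (1.0 - mean_p)
        if denominator > 0.0:
            statistic += (observed - expected) ** 2 / denominator
    return statistic


# Hypothetical predictions: two risk groups, with more events than predicted
# in the high-risk group and fewer in the low-risk group.
y_prob = [0.2] * 5 + [0.8] * 5
y_true = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
print(hosmer_lemeshow_statistic(y_true, y_prob, groups=2))
```

A small statistic relative to the chi-square reference suggests the predicted probabilities agree with the observed severity rates; a large one signals poor calibration.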