4,015 research outputs found

Addressing Transportation Equity by Comparing In-Service Performance of Roadside Safety Devices through Machine Learning Modeling

Author: Wang Hanzhen
Publication venue: Digital Scholarship @ Texas Southern University
Publication date: 01/08/2021

Transportation equity plays an important role in modern communities, and a fair distribution of transportation infrastructure is an integral part of the transportation planning process. In-Service Performance Evaluation (ISPE) supports transportation safety requirements by identifying problems with roadside safety devices during installation and maintenance and proposing solutions, and its results reveal the current status of the target devices in specific areas. Although several studies have emphasized transportation equity, equity research that specifically addresses the deployment of roadside safety devices in light of ISPE results is still lacking. A proper comparison of in-service performance results across areas can demonstrate the importance of ensuring transportation equity for all communities and areas in the decision-making process. This thesis uses Machine Learning models to analyze linked crash and roadway data related to major roadside safety devices installed in Texas. Three typical roadside safety devices are assessed: (1) guardrail, (2) median barrier, and (3) bridge rail. By comparing statistical and Machine Learning based modeling analyses for rural and metropolitan areas in specific counties, the thesis demonstrates that the share of crashes causing heavy property damage or serious injuries is higher in rural communities despite their lower crash frequency. The data analysis suggests that parameters related to roadway conditions and transportation infrastructure tend to have a greater influence on the performance of rural safety devices. An analysis of one additional year of crash data also addresses the importance of transportation equity during the COVID-19 pandemic. Recommendations for improving overall equity and Environmental Justice (EJ) across all regions are made based on these findings.

Texas Southern University, School of Public Affairs: Digital Scholarship

Severity Analysis of Large Truck Crashes- Comparision Between the Regression Modeling Methods with Machine Learning Methods.

Author: Liu Jinli
Publication venue: Digital Scholarship @ Texas Southern University
Publication date: 01/08/2021

According to the Texas Department of Transportation’s Texas Motor Vehicle Crash Statistics, Texas has had the highest number of severe crashes involving large trucks in the US. As defined by the US Department of Transportation, a large truck is any vehicle with a gross vehicle weight rating greater than 10,000 pounds. Large trucks generally need more time and much more space to accelerate, slow down, and stop. They also have large blind spots when making wide turns. Therefore, when an unexpected traffic situation arises, it is more difficult for a large truck than for a regular vehicle to take evasive action to avoid a collision. Because of the large size and heavy weight of these vehicles, large truck crashes often result in huge economic and social costs. It is therefore useful to predict the severity level of a reported large truck crash whose severity is unknown, or of crashes that may be expected to occur in the future. Such predictions can help prevent crashes or help rescue teams and hospitals provide proper medical care as fast as possible. To identify appropriate modeling approaches for predicting the severity of large truck crashes, this research selected four representative classification tree-based ML models (Extreme Gradient Boosting tree (XGBoost), Adaptive Boosting tree (AdaBoost), Random Forest (RF), and Gradient Boost Decision Tree (GBDT)), two non-tree-based ML models (Support Vector Machines (SVM) and k-Nearest Neighbors (kNN)), and an LR model. The results indicate that the GBDT model performs best among all seven models.

Texas Southern University, School of Public Affairs: Digital Scholarship

Accident prediction using machine learning: analyzing weather conditions, and model performance

Author: Abbas M.S. (Muhammad Shahroz)
Publication venue: University of Oulu
Publication date: 15/06/2023

The primary focus of this study was to investigate the impact of weather and road conditions on the severity of accidents and to determine whether machine learning models can accurately predict the likelihood of such incidents. The research centered on two key questions. First, the study examined the influence of weather and road conditions on accident severity and identified the factors most strongly related to accidents. We used an open-source accident dataset, preprocessed with variable selection, elimination of missing data, and data balancing through the Synthetic Minority Over-sampling Technique (SMOTE). A chi-square statistical analysis suggested that all weather-related variables are associated, to varying degrees, with accident severity. Visibility and temperature were found to be the most critical factors affecting the severity of road accidents. Hence, measures such as effective fog dispersal systems, heatwave alerts, or improved road maintenance during extreme temperatures could help reduce accident severity. Second, the research evaluated the ability of machine learning models, including decision trees, random forests, naive Bayes, extreme gradient boosting, and neural networks, to predict accident likelihood. Model performance was gauged using metrics such as accuracy, precision, recall, and F1 score. The Random Forest model emerged as the most reliable and accurate model for predicting accidents, with an overall accuracy of 98.53%. The Decision Tree model also showed high overall accuracy (95.33%), indicating its reliability. The Naive Bayes model, however, showed the lowest accuracy (63.31%) and was deemed less reliable in this context. It is concluded that machine learning models can be used effectively to predict the likelihood of accidents, with Random Forest and Decision Tree proving the most effective. 
However, the effectiveness of each model may vary depending on the dataset and context, necessitating further testing and validation before real-world implementation. These findings not only provide insight into the factors affecting accident severity but also open a promising avenue for employing machine learning techniques in proactive accident prediction and mitigation. Future studies can aim to refine the models further and potentially integrate them into traffic management systems to enhance road safety.
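The four metrics named in this abstract can be illustrated with a minimal sketch for the binary case. The function name and the confusion-matrix counts below are hypothetical and chosen only for illustration; they are not taken from the thesis, which does not state how its metrics were computed.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, recall, and F1 from binary confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: share of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of true positives that the model recovers.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=80, fp=20, fn=10, tn=90)
print(metrics)
```

On imbalanced data, accuracy alone can look high while recall on the rare class stays low, which is one reason to report all four values together.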

University of Oulu Repository - Jultika

A novel one-vs-rest consensus learning method for crash severity prediction

Author: Ashraf Muhammad Mansoor, Hussain Syed Fawad
Publication venue: Elsevier BV
Publication date: 15/10/2023

University of Birmingham Research Portal

Short-term crash risk prediction considering proactive, reactive, and driver behavior factors

Author: Darban Khales Sina
Publication venue: Digital Commons @ NJIT
Publication date: 31/08/2021

Providing a safe and efficient transportation system is the primary goal of transportation engineering and planning. Highway crashes are among the most significant challenges to achieving this goal. They take a significant societal toll, reflected in numerous fatalities, personal injuries, property damage, and traffic congestion. To that end, much attention has been given to predictive models of crash occurrence and severity. Most of these models are reactive: they use data about past crashes to identify significant crash factors, crash hot-spots, and crash-prone roadway locations, and to analyze and select the most effective countermeasures for reducing the number and severity of crashes. More recently, advances have been made in developing proactive crash risk models that assess short-term crash risk in near-real time. Such models could be applied as part of traffic management strategies to prevent and mitigate crashes. Driver behavior is found to be the leading cause of highway crashes. Nevertheless, because of data unavailability, few studies have explored and quantified the role of driver behavior in crashes. The Strategic Highway Research Program Naturalistic Driving Study (SHRP 2 NDS) offers an unprecedented opportunity to perform an in-depth analysis of the impact of driver behavior on crash events. The research presented in this dissertation is divided into three parts, corresponding to the research objectives. The first part investigates the application of advanced data modeling methods for proactive crash risk analysis. Several proactive models for segment-level crash risk and severity assessment are developed and tested, considering the proactive data available to most transportation agencies in real time at a regional network scale. The data include roadway geometry characteristics, traffic flow characteristics, and weather condition data. 
The analysis methods include Random-effect Bayesian Logistic Regression, Random Forest, Gradient Boosting Machine, K-Nearest Neighbor, Gaussian Naive Bayes (GNB), and Multi-layer Feedforward Deep Neural Network (MLFDNN). The random oversampling technique is applied to deal with the data imbalance associated with the injury severity analysis. Model training and testing are completed using a dataset containing records of 10,155 crashes that occurred on two interstate highways in New Jersey over a period of two years. The second part of the study analyzes the potential improvement in the prediction abilities of the proposed models when reactive data (such as vehicle characteristics and driver characteristics) are added to the analysis. Commonly, reactive data are only available (known) after the crash occurs. In the proposed research, the crash analysis is performed by classifying crashes into multiple groups (instead of a single group), constructed based on the age of drivers and vehicles, to account for the impact of reactive data on driver injury severity outcomes. The results of the second part of the study show that while the simultaneous use of reactive and proactive data can improve the prediction performance of the models, the absolute crash probability values must be further improved for operational crash risk prediction. To this end, in the third part of the study, the Naturalistic Driving Study data are used to calibrate the crash risk models, including the driver behavior risk factors. The findings show significant improvement in crash prediction accuracy with the inclusion of driver behavior risk factors, which confirms that driver behavior is the most critical risk factor affecting crash likelihood and the associated injury severity.
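The random oversampling step mentioned in this abstract can be sketched as follows. This is a generic illustration under stated assumptions, not the dissertation's implementation: the function name, the toy feature rows, and the class labels are all hypothetical, and every minority class is simply resampled with replacement until it matches the largest class.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    # Every class is grown to the size of the largest one.
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Sample with replacement to fill the gap to the target size.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical toy data: four non-injury crashes and one injury crash.
rows = [[0.1], [0.2], [0.3], [0.4], [0.9]]
labels = ["no_injury"] * 4 + ["injury"]
balanced_rows, balanced_labels = random_oversample(rows, labels)
counts = Counter(balanced_labels)
print(counts)
```

Oversampling is normally applied to the training split only, so that duplicated rows do not leak into the test set and inflate the reported accuracy.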

Digital Commons @ New Jersey Institute of Technology (NJIT)

Evaluation of machine learning algorithms as predictive tools in road safety analysis

Author: Tayebikhorami Saeid
Publication venue: University of Saskatchewan Library
Publication date: 01/04/2022

The Highway Safety Manual (HSM)’s road safety management process (RSMP) represents the state-of-the-practice procedure that transportation professionals employ to monitor and improve safety on existing roadway sites. RSMP requires the development of safety performance functions (SPFs), the key regression tools in the RSMP, which predict crash frequency given a set of roadway and traffic factors. Although SPFs developed using traditional regression modeling have proven to be reliable tools for road safety predictive analytics, some limitations and constraints have been highlighted in the literature, such as the assumption of a probability distribution, the selection of a pre-defined functional form, possible correlation between independent variables, and possible transferability issues. An alternative to traditional regression models as predictive tools is the use of Machine Learning (ML) algorithms. Although ML provides a new modeling technique, it still has built-in assumptions, and its performance in collision frequency modeling needs to be studied. This research 1) compares the prediction performance of three well-known ML algorithms, i.e., Support Vector Machine (SVM), Decision Tree (DT), and Random Forest (RF), to traditional SPFs, 2) conducts a sensitivity analysis and compares ML with the functional form of the negative binomial (NB) model as the default traditional regression modeling technique, and 3) applies and validates ML algorithms in network screening (hotspot identification), which is the first step in the RSMP. To achieve these objectives, a dataset of urban signalized and unsignalized intersections from two major municipalities in Saskatchewan (Canada) was considered as a case study. The results showed that the ML prediction accuracies are comparable with that of the NB model. 
Moreover, the sensitivity analysis showed that the ML algorithms’ predictions are mostly affected by changes in traffic volume, rather than by other roadway factors. Lastly, the consistency of the ML-based measure in identifying hotspots appeared to be comparable to that of SPF-based measures, e.g., the excess (predicted and expected) average crash frequency. Overall, the results of this research support the use of ML as a predictive tool in network screening, which provides transportation practitioners with an alternative modeling approach to identify collision-prone locations where countermeasures aimed at reducing collision frequency at urban intersections can be installed.

University of Saskatchewan Research Archive

Machine Learning Applications to Predict Road Crash and Soccer Game Outcomes

Author: Bai Lu
Publication venue: Digital Commons @ New Haven
Publication date: 01/12/2019

Machine learning has become a cutting-edge and widely studied field of data science in recent years across many industries and disciplines. In this thesis, two problems, (1) crash severity prediction and (2) soccer game outcome prediction, were investigated using a set of machine learning approaches, namely Ridge regression, Lasso regression, Support Vector Machine (SVM), Neural Network (NN), and Random Forest (RF). The first study focuses on investigating the critical factors affecting crash severity using comprehensive state-wide time-series traffic crash data. The dataset covers crashes that occurred in the state of Connecticut between 1995 and 2014. Traffic crashes are an increasing cause of death and injury in the world. The overall purposes of the first study were to propose, develop, and implement machine learning approaches for predicting the injury severity levels of the people involved in crashes and to investigate the important crash predictors contributing to injury severity. The predictor variables included road and vehicle conditions, characteristics of drivers and passengers, and environmental conditions. Results indicate that RF provided the best prediction accuracy, 73.85%, in correctly classifying a crash by its severity: fatal, injury, or property damage only. In addition to the overall comparison of the proposed machine learning approaches in terms of accuracy, the prediction results were combined with the economic loss of each severity level to provide managerial insights on estimating the financial consequences of traffic crashes. RF provided the importance of each predictor in affecting the severity levels of the people involved. The ejection status of the driver or passenger was found to be the most crucial factor leading to the most severe injuries. In addition, a time series analysis of the 20 years of crash data was conducted. 
The analysis results demonstrated that the prediction accuracy of RF increased with the period, and the importance of some predictors also changed. From a policy-making perspective, strict inspection for drunk driving and drug use could lead to substantial road safety improvement. Ejection status is the essential risk factor affecting the fatal and incapacitating severity levels. The use of seat belts significantly reduces the risk of passengers being ejected from the vehicle when a crash occurs. In the second study, game data from the five most recent seasons of three major leagues were scraped from whoscore.com. The leagues were two top European leagues, Spanish La Liga and the English Premier League (EPL), and one US league, Major League Soccer (MLS). The purpose of the study was to develop statistically credible machine learning approaches to predict soccer game outcomes and to investigate the significance of predictors (game statistics). Unlike previous closely related studies, the proposed machine learning models were applied not only to the combined dataset of the three leagues but also to each league separately, to compare the prediction performance and important predictors. The best prediction performance was achieved by NN, with an accuracy of 85.71% (+/- 0.73%) on the combined dataset. For each league, RF had the best performance. RF also provided the importance of each predictor. The results showed that home-field advantage was more evident in MLS games than in the two European leagues. The home team or away team factor was the most critical predictor affecting MLS games. Although it was also an important predictor for La Liga and EPL games, the most influential predictor there was the difference in the number of shots on target between the home team and the away team. For the three leagues, the number of crosses was the most significant pass type, and the difference in the rate of cards per foul was the most crucial card situation. 
The referee primarily determines the difference in the rate of cards per foul. For the European leagues, the differences in the number of counter attacks and open plays were consequential attempt types affecting a game result in La Liga and the EPL, while in the MLS, the difference in the number of set pieces was the most crucial predictor variable. Overall, the results of the two studies indicated that the proposed machine learning approaches yielded effective prediction performance for both crash severity and soccer outcome prediction. RF had slightly superior prediction performance among the five machine learning models for both studies. Even though the two problem domains come from different industries or policy-making areas, the proposed machine learning approaches effectively dealt with the complexity of the data in terms of dimensionality and time-series nature.

Digital Commons @ New Haven

Work Zone Safety Analysis, Investigating Benefits from Accelerated Bridge Construction (ABC) on Roadway Safety

Author: Mokhtarimousavi Seyedmirsajad
Publication venue: FIU Digital Commons
Publication date: 06/10/2020

The attributes of work zones have significant impacts on the risk of crash occurrence. Therefore, identifying the factors associated with crash severity and frequency in work zone locations is of great value to roadway safety. In addition, the significant loss of workers’ lives and the injuries resulting from work zone crashes indicate the urgent need for a comprehensive and in-depth investigation of work zone crash mechanisms. The cost of work zone crashes is another issue that should be taken into account, as work zone crashes cost society millions of dollars each year. Applying innovative construction methods like Accelerated Bridge Construction (ABC) dramatically decreases on-site construction duration and thus improves roadway safety. This safe and cost-effective procedure for building new bridges or replacing/rehabilitating existing bridges in just a few weeks instead of months or years may prevent crashes and avoid injuries that result from work zone presence. The application of machine learning techniques in traffic safety studies has seen explosive growth in recent years. Compared to statistical methods, ML models are more accurate prediction models due to their ability to deal with more complex functions. To this end, this study focuses on three major areas: crash severity at construction work zones with worker presence, crash frequency at bridge locations, and assessment of the associated costs to calculate the contribution of safety to the benefit-cost ratio of ABC as compared to conventional methods. One key finding is that an in-depth investigation of contributing factors, in conjunction with the results from statistical and machine learning models, can provide a more comprehensive interpretation of crash severity/frequency outcomes. For severity analysis with a high level of confidence, work zone crashes need to be modeled separately by time of day. 
Investigation of the contributing factors revealed a nonlinear relationship between crash severity/frequency and those factors. Finally, the results showed that the safety benefits in a case study in Florida amounted to 43% of the total ABC implementation cost. This indicates that the safety benefits of ABC implementation constitute a considerable portion of its benefit-cost ratio.

DigitalCommons@Florida International University

DATA-DRIVEN BAYESIAN METHOD-BASED TRAFFIC CRASH DRIVER INJURY SEVERITY FORMULATION, ANALYSIS, AND INFERENCE

Author: Chen Cong
Publication venue: UNM Digital Repository
Publication date: 01/02/2016

Traffic crashes have resulted in significant cost to society in terms of life and economic losses, and comprehensive examination of crash injury outcome patterns is of practical importance. By inferring the parameters of interest from prior information and the studied datasets, Bayesian models are efficient data analysis methods that yield more accurate results, but their applications in traffic safety studies are still limited. By examining driver injury severity patterns, this research systematically examines the applicability of Bayesian methods to driver injury severity prediction in traffic crashes. In this study, three types of Bayesian models are defined: hierarchical Bayesian regression model, Bayesian non-regression model, and knowledge-based Bayesian non-parametric model, and a conceptual framework is developed for selecting the appropriate Bayesian model based on distinct research purposes. Five Bayesian models are applied accordingly to test their effectiveness in traffic crash driver injury severity prediction and variable impact estimation: hierarchical Bayesian binary logit model, hierarchical Bayesian ordered logit model, hierarchical Bayesian random intercept model with cross-level interactions, multinomial logit (MNL)-Bayesian Network (BN) model, and decision table/naïve Bayes (DTNB) model. A complete dataset containing all crashes occurring on New Mexico roadways in 2010 and 2011 is used for model analyses. The studied dataset is composed of three major sub-datasets (crash, vehicle, and driver), and all included variables are accordingly divided into two hierarchical levels: crash-level variables and vehicle/driver variables. 
All five models show promising performance in injury severity prediction and variable influence analysis, and the results underscore the heterogeneous impacts of the significant variables on driver injury severity outcomes. The performances of these models are also compared with one another or with traditional traffic safety models. Based on the results, tentative suggestions regarding countermeasures and further research efforts to reduce crash injury severity are proposed. The research results enhance the understanding of the applicability of Bayesian methods in traffic safety analysis and of the mechanisms of crash injury severity outcomes, and provide useful inferences for improving the safety performance of the transportation system.