9,377 research outputs found

    Decision Support for Road Safety: Development of Key Performance Indicators for Police Analysts

    In 2017, five out of 100,000 people were killed in road accidents in Europe. To reduce this number with appropriate measures, police currently define combinations of accident attributes manually (e.g., accidents on slippery road surfaces at night), which then form the basis for tracking the number of accidents over time. The aim of this paper is to combine the following data analysis approaches in order to detect interesting attribute combinations, also referred to as “itemsets”, relevant for current and future observations. The resulting combinations are proposed to the police as new key performance indicators and can also be used directly for planning police measures to increase road safety. A four-stage decision support system is introduced that employs frequent itemset mining in the first stage. The temporal aspect of traffic accident data is illustrated by time series containing, for each itemset, the relative frequencies of accidents with the corresponding attribute combination. In the second stage, the time series are grouped according to their shape by time series clustering and classification. In the third stage, we determine the optimal forecasting method for each generated cluster of time series. Based on the prediction of future frequencies, we identify the most interesting attribute combinations in the last stage. These are displayed geographically so that a police analyst can easily identify current and developing hot spots.
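
    The first stage described above (frequent itemset mining over accident attributes, with relative frequencies per itemset) can be illustrated with a minimal sketch. It is not the paper's implementation: it enumerates attribute combinations by brute force instead of using an Apriori-style miner, and the function name, attribute labels, and threshold are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(records, min_support, max_size=2):
    """Return {itemset: relative frequency} for itemsets meeting min_support.

    Brute-force enumeration for illustration only; attribute names and
    the support threshold are assumptions, not taken from the paper.
    """
    counts = Counter()
    for attrs in records:
        items = sorted(set(attrs))
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[combo] += 1
    n = len(records)
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


# Hypothetical accident records, each a set of attribute values.
accidents = [
    {"night", "slippery"},
    {"night", "slippery", "rural"},
    {"day", "dry"},
    {"night", "dry"},
]
result = frequent_itemsets(accidents, min_support=0.5)
print(result)
```

    Running the same function once per time period (for example, per month) would give, for each itemset, the kind of relative-frequency time series the abstract describes.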

    Data Mining Approach of Accident Occurrences Identification with Effective Methodology and Implementation

    Data mining is used in various research domains to identify new causes for an effect observed in society around the globe. This article applies data mining for the same purpose: to identify accident occurrences in different regions and the most valid reasons why accidents happen. Data mining and advanced machine learning algorithms are used in this research approach, and the article discusses hyperline, classification, pre-processing of the data, and training the machine with sample datasets collected from different regions, which contain structural and semi-structural data. We examine machine learning and data mining classification algorithms in depth to find or predict something novel about accident occurrences around the globe. To narrow the research task, we concentrate on two basic and important classification algorithms: SVM (support vector machine) and the CNB classifier. The discussion uses the WEKA tool for the CNB classifier, Bag of Words identification, word count, and frequency calculation.

    Temporospatial Context-Aware Vehicular Crash Risk Prediction

    With the demand for more vehicles increasing, road safety is becoming a growing concern. Traffic collisions take many lives and cost billions of dollars in losses. This explains the growing interest of governments, academic institutions and companies in road safety. The vastness and availability of road accident data has provided new opportunities for gaining a better understanding of accident risk factors and for developing more effective accident prediction and prevention regimes. Much of the empirical research on road safety and accident analysis utilizes statistical models which capture limited aspects of crashes. On the other hand, data mining has recently gained interest as a reliable approach for investigating road-accident data and for providing predictive insights. While some risk factors contribute more frequently in the occurrence of a road accident, the importance of driver behavior, temporospatial factors, and real-time traffic dynamics have been underestimated. This study proposes a framework for predicting crash risk based on historical accident data. The proposed framework incorporates machine learning and data analytics techniques to identify driving patterns and other risk factors associated with potential vehicle crashes. These techniques include clustering, association rule mining, information fusion, and Bayesian networks. Swarm intelligence based association rule mining is employed to uncover the underlying relationships and dependencies in collision databases. Data segmentation methods are employed to eliminate the effect of dependent variables. Extracted rules can be used along with real-time mobility to predict crashes and their severity in real-time. The national collision database of Canada (NCDB) is used in this research to generate association rules with crash risk oriented subsequents, and to compare the performance of the swarm intelligence based approach with that of other association rule miners. 
Many industry-demanding datasets, including road-accident datasets, are deficient in descriptive factors. This is a significant barrier to uncovering meaningful risk factor relationships. To resolve this issue, this study proposes a knowledgebase approximation framework to enhance the crash risk analysis by integrating pieces of evidence discovered from disparate datasets capturing different aspects of mobility. Dempster-Shafer theory is utilized as a key element of this knowledgebase approximation. This method can integrate association rules with acceptable accuracy under certain circumstances that are discussed in this thesis. The proposed framework is tested on the lymphography dataset and the road-accident database of Great Britain. The derived insights are then used as the basis for constructing a Bayesian network that can estimate crash likelihood and risk levels so as to warn drivers and prevent accidents in real time. This Bayesian network approach offers a way to implement a naturalistic driving analysis process for predicting traffic collision risk based on the findings from the data-driven model. A traffic incident detection and localization method is also proposed as a component of the risk analysis model. Detecting and localizing traffic incidents enables timely response to accidents and facilitates effective and efficient traffic flow management. The results obtained from the experimental work conducted on this component are indicative of the capability of our Dempster-Shafer data-fusion-based incident detection method in overcoming the challenges arising from erroneous and noisy sensor readings.
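
The Dempster-Shafer fusion mentioned above rests on Dempster's rule of combination, which merges two mass functions and renormalizes away conflicting evidence. The sketch below shows only that rule; the two sensor mass functions, the hypothesis labels, and the function name are hypothetical and do not come from the thesis.

```python
def combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to its mass.
    Mass assigned to empty intersections is treated as conflict and
    removed by renormalization.
    """
    combined = {}
    conflict = 0.0
    for a, wa in m1.items():
        for b, wb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + wa * wb
            else:
                conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {s: w / (1.0 - conflict) for s, w in combined.items()}


# Hypothetical evidence from two noisy sensors about a road segment.
INCIDENT = frozenset({"incident"})
NORMAL = frozenset({"normal"})
EITHER = INCIDENT | NORMAL  # mass here expresses ignorance

sensor_a = {INCIDENT: 0.6, EITHER: 0.4}
sensor_b = {INCIDENT: 0.7, NORMAL: 0.1, EITHER: 0.2}
fused = combine(sensor_a, sensor_b)
print(fused)
```

In this made-up example the fused belief in an incident is higher than either sensor reported alone, because both sources lean the same way and only a small share of mass is in conflict.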

    Evaluation of Parametric and Nonparametric Statistical Models in Wrong-way Driving Crash Severity Prediction

    Wrong-way driving (WWD) crashes result in more fatalities per crash, involve more vehicles, and cause extended road closures compared to other types of crashes. Although crashes involving wrong-way drivers are relatively few, they often lead to fatalities and serious injuries. Researchers have been using parametric statistical models to identify factors that affect WWD crash severity. However, these parametric models are generally based on several assumptions, and the results could generate numerous errors and become questionable when these assumptions are violated. On the other hand, nonparametric methods such as data mining or machine learning techniques do not use a predetermined functional form, can address the correlation problem among independent variables, display results graphically, and simplify the potential complex relationship between the variables. The main objective of this research was to demonstrate the applicability of nonparametric statistical models in successfully identifying factors affecting traffic crash severity. To achieve this goal, the performance of parametric and nonparametric statistical models in WWD crash severity prediction was evaluated. The following parametric methods were evaluated: Logistic Regression (LR), Ridge Regression (RR), Least Absolute Shrinkage and Selection Operator (LASSO), Linear Discriminant Analysis (LDA), and Gaussian Naïve Bayes (GNB). The following nonparametric methods were evaluated: Random Forests (RF), Decision Trees (DT), and Support Vector Machine (SVM). The evaluation was based on sensitivity, specificity, and prediction accuracy. The research also demonstrated the applicability of nonparametric supervised learning algorithms on crash severity analysis by combining tree-based data mining techniques and marginal effect analysis to show the correlation between the response and the predictor variables. 
The analysis was based on 1,475 WWD crashes that occurred on arterial road networks from 2012-2016 in Florida. The results showed that nonparametric models provided better prediction accuracy on predicting serious injury compared to parametric models. By conducting prediction accuracy comparison, contributor variables’ marginal effect analysis, variable importance evaluation, and crash severity pattern recognition analysis, the nonparametric models have been demonstrated to be valid and proved to serve as an alternative tool in transportation safety studies. The results showed that head-on collisions, weekends, high-speed facilities, crashes involving vehicles entering from a driveway, dark-not lighted roadways, older drivers, and driver impairment are important factors that play a crucial role in WWD crash severity on non-limited access facilities. This information may assist researchers and safety engineers in identifying specific strategies to reduce the severity of WWD crashes on arterial streets. Besides unveiling the factors contributing to WWD crash severity and their relationship with each other, this research has demonstrated the potential of using data mining techniques in yielding results that are easily understandable and interpretable
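
The three evaluation criteria named above (sensitivity, specificity, and prediction accuracy) can be computed from a binary confusion matrix. The following is a small sketch with made-up labels, where 1 stands for a serious-injury crash and 0 for any other outcome; the coding and the function name are assumptions for illustration.

```python
def evaluate(y_true, y_pred):
    """Return sensitivity, specificity, and accuracy for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),  # share of serious crashes detected
        "specificity": tn / (tn + fp),  # share of other crashes detected
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical ground truth and model predictions for eight crashes.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
metrics = evaluate(y_true, y_pred)
print(metrics)
```

Reporting sensitivity alongside accuracy matters when serious-injury crashes are the minority class, since a model can reach high accuracy while missing most of them.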

    Fine-tuning Road Classification Models: Optimization Strategies for Deep Belief Networks in Transportation Big Data

    Road traffic accidents are a serious concern for ordinary people, causing an estimated 1.2 million deaths and 50 million injuries worldwide every year. Road accidents are among the principal causes of fatality and injury. Traffic safety has become a major concern for the sustainable development of modern traffic and transportation. Analyzing the causes of road traffic accidents can identify the major factors quickly and professionally and provide guidance for preventing and reducing accidents, which might significantly decrease the number of casualties. Data mining techniques are used in the process of knowledge discovery for problems in many domains. Feature selection plays a vital role for a large number of datasets. In this paper, the classification of road accidents in the transportation domain is analyzed with the proposed intelligent classification technique. In this technique, the hidden-layer weights of the DBN are optimized using an evolutionary genetic algorithm (GA). The GA is used to enhance classification accuracy by being applied to the hidden layers of the Restricted Boltzmann Machine (RBM). The comparative results show that the proposed intelligent classifier gives improved accuracy, specificity, precision, sensitivity, and F-measure, and a reduced false positive rate.

    Exploratory analysis of injury severity under different levels of driving automation (SAE Level 2-5) using multi-source data

    Vehicles equipped with automated driving capabilities have shown the potential to improve safety and operations. Advanced driver assistance systems (ADAS) and automated driving systems (ADS) have been widely developed to support vehicular automation. Although the studies on the injury severity outcomes that involve automated driving systems are ongoing, there is limited research investigating the difference between injury severity outcomes of the ADAS and ADS vehicles using real-world crash data. To ensure comprehensive analysis, a multi-source dataset that includes the NHTSA crash database (752 cases), CA DMV crash reports (498 cases), and news outlet data (30 cases) is used. Two random parameters multinomial logit models with heterogeneity in the means and variances are estimated to gain a better understanding of the variables impacting the crash injury severity outcome for the ADAS (SAE Level 2) and ADS (SAE Levels 3-5) vehicles. We found that while 56 percent of crashes involving ADAS vehicles took place on a highway, 84 percent of crashes involving ADS took place in more urban settings. The model estimation results indicate that the weather indicators, traffic incident or work zone indicator, differences in the system sophistication that are captured by both manufacture year and high or low mileage, type of collision, as well as rear and front impact indicators all play a significant role in the crash injury severity. The results offer an exploratory assessment of the safety performance of the ADAS and ADS equipped vehicles in the real-world environment and can be used by the manufacturers and other stakeholders to dictate the direction of their deployment and usage.

    Accident prediction using machine learning: analyzing weather conditions, and model performance

    The primary focus of this study was to investigate the impact of weather and road conditions on the severity of accidents and to determine the feasibility of machine learning models in accurately predicting the likelihood of such incidents. The research was centered on two key research questions. Firstly, the study examined the influence of weather and road conditions on accident severity and identified the most related factors contributing to accidents. We utilized an open-source accident dataset, which was preprocessed using techniques like variable selection, missing data elimination, and data balancing through the Synthetic Minority Over-sampling Technique (SMOTE). Chi-square statistical analysis was performed, suggesting that all weather-related variables are more or less associated with the severity of accidents. Visibility and temperature were found to be the most critical factors affecting the severity of road accidents. Hence, appropriate measures such as implementing effective fog dispersal systems, heatwave alerts, or improved road maintenance during extreme temperatures could help reduce accident severity. Secondly, the research evaluated the ability of machine learning models including decision trees, random forests, naive Bayes, extreme gradient boosting, and neural networks to predict accident likelihood. The models’ performance was gauged using metrics like accuracy, precision, recall, and F1 score. The Random Forest model emerged as the most reliable and accurate model for predicting accidents, with an overall accuracy of 98.53%. The Decision Tree model also showed high overall accuracy (95.33%), indicating its reliability. However, the Naive Bayes model showed the lowest accuracy (63.31%) and was deemed less reliable in this context. It is concluded that machine learning models can be effectively used to predict the likelihood of accidents, with models like Random Forest and Decision Tree proving the most effective.
However, the effectiveness of each model may vary depending on the dataset and context, necessitating further testing and validation for real-world implementation. These findings not only provide insight into the factors affecting accident severity but also open a promising avenue in employing machine learning techniques for proactive accident prediction and mitigation. Future studies can aim to refine the models further and potentially integrate them into traffic management systems to enhance road safety
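
The chi-square analysis mentioned above tests whether a categorical weather variable and accident severity are associated. A minimal sketch of the Pearson chi-square statistic for a contingency table follows; the counts are invented for illustration and are not from the study's dataset.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts: rows = low / good visibility,
# columns = severe / non-severe accidents.
table = [
    [30, 10],
    [20, 40],
]
chi2 = chi_square_statistic(table)
print(round(chi2, 3))
```

For a 2x2 table the statistic has one degree of freedom, so a value above roughly 3.84 would indicate an association at the 5% significance level.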

    Heuristic modelling of traffic accident characteristics

    Due to the complex structure of observation-based traffic accident data and the absence of an analytic model to define their characteristics, different models are required. Accident characteristics have been modeled for different road segments with two different methods: an evolutionary data clustering method and resilient neural networks. In the first method, observation data was clustered using an evolutionary search-based clustering algorithm. This method determines whether observation-based test data have the conditions of a possible death or injury accident based on the distance to the cluster centers obtained. The second is a regression method that predicts whether an accident will cause death or injury according to observation-based traffic data in test road segments by using resilient neural networks. Experiment results show that data analysis methods are very effective in determining the existence of the conditions that may cause accidents resulting in death or injury.
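
    The distance-to-cluster-center decision described for the first method can be sketched as a nearest-center lookup. The cluster centers below are invented; in the paper they would come from the evolutionary clustering step, which this sketch does not reproduce. The feature meanings and labels are likewise hypothetical.

```python
import math


def nearest_center(point, centers):
    """Return the label of the cluster center closest to point (Euclidean)."""
    return min(centers, key=lambda label: math.dist(point, centers[label]))


# Hypothetical centers over two normalized features
# (for example, traffic speed and traffic volume).
centers = {
    "injury_or_death": (0.8, 0.9),
    "no_injury": (0.2, 0.1),
}
observation = (0.7, 0.6)
label = nearest_center(observation, centers)
print(label)
```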


    Traffic crashes have resulted in significant cost to society in terms of life and economic losses, and comprehensive examination of crash injury outcome patterns is of practical importance. By inferring the parameters of interest from prior information and studied datasets, Bayesian models are efficient methods in data analysis with more accurate results, but their applications in traffic safety studies are still limited. By examining driver injury severity patterns, this research systematically examines the applicability of Bayesian methods in traffic crash driver injury severity prediction. In this study, three types of Bayesian models are defined: hierarchical Bayesian regression model, Bayesian non-regression model and knowledge-based Bayesian non-parametric model, and a conceptual framework is developed for selecting the appropriate Bayesian model based on discrete research purposes. Five Bayesian models are applied accordingly to test their effectiveness in traffic crash driver injury severity prediction and variable impact estimation: hierarchical Bayesian binary logit model, hierarchical Bayesian ordered logit model, hierarchical Bayesian random intercept model with cross-level interactions, multinomial logit (MNL)-Bayesian Network (BN) model, and decision table/naïve Bayes (DTNB) model. A complete dataset containing all crashes occurring on New Mexico roadways in 2010 and 2011 is used for model analyses. The studied dataset is composed of three major sub-datasets: crash dataset, vehicle dataset and driver dataset, and all included variables are therefore divided into two hierarchical levels accordingly: crash-level variables and vehicle/driver variables.
Across all five models, the results show promising performance on injury severity prediction and variable influence analysis, and they underscore the heterogeneous impacts of the significant variables on driver injury severity outcomes. The performance of these models is also compared with one another and with traditional traffic safety models. Based on the analyzed results, tentative suggestions regarding countermeasures and further research efforts to reduce crash injury severity are proposed. The research results enhance the understanding of the applicability of Bayesian methods in traffic safety analysis and of the mechanisms of crash injury severity outcomes, and provide beneficial inference to improve safety performance of the transportation system.