Crash data quality for road safety research: current state and future directions
Crash databases are one of the primary data sources for road safety research. Their quality is therefore fundamental to the accuracy of crash analyses and, consequently, to the design of effective countermeasures. Although crash data often suffer from correctness and completeness issues, these are rarely discussed or addressed in crash analyses. Crash reports aim to answer the five “W” questions (i.e. When?, Where?, What?, Who? and Why?) of each crash by including a range of attributes. This paper reviews the current literature on the state of crash data quality for each of these questions separately. The most serious data quality issues appear to be: inaccuracies in crash location and time, difficulties in data linkage (e.g. with traffic data) due to inconsistencies between databases, severity misclassification, inaccuracies and incompleteness of involved users’ demographics, and inaccurate identification of crash contributory factors. It is shown that the extent and severity of data quality issues are not equal across attributes, and their level of impact on road safety analyses is not yet entirely known. This paper highlights areas that require further research and provides some suggestions for the development of intelligent crash reporting systems.
Multilevel logistic regression modelling for crash mapping in metropolitan areas
The spatial nature of traffic crashes makes crash locations one of the most important and informative attributes of crash databases. It is, however, very likely that recorded crash locations, in terms of easting and northing coordinates, distances from junctions, addresses, road names and types, are inaccurately reported. Improving the quality of crash locations therefore has the potential to enhance the accuracy of many spatial crash analyses. The determination of correct crash locations usually requires a combination of crash and network attributes with suitable crash mapping methods. Urban road networks are more sensitive to erroneous matches due to high road density and inherent complexity. This paper presents a novel crash mapping method suitable for urban and metropolitan areas that matched all the crashes that occurred in London from 2010 to 2012. The method is based on a hierarchical data structure of crashes (i.e. candidate road links are nested within vehicles, and vehicles are nested within crashes) and employs a multilevel logistic regression model to estimate the probability distribution of mapping a crash onto a set of candidate road links. The road link with the highest probability is considered to be the correct segment for mapping the crash. Matching is based on two primary variables: (a) the distance between the crash location and a candidate segment and (b) the difference between the vehicle direction just before the collision and the link direction. Although road names were not considered, due to the limited availability of this variable in the applied crash database, the developed method achieves 97.1% (±1%) accurate matches (N=1,000). The method was compared with two simpler, non-probabilistic crash mapping algorithms, and the results were used to demonstrate the effect of crash location data quality on a crash risk analysis.
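The scoring idea behind this matching can be sketched as follows. This is a minimal illustration, not the paper's fitted multilevel model: the coefficients and candidate links below are hypothetical placeholders, and only the two named variables (distance to the candidate link and heading difference) are used.

```python
import math

def match_probabilities(candidates, beta_dist=-0.5, beta_angle=-0.05):
    """Score candidate road links for one crash with a logistic-style model.

    candidates: list of (link_id, distance_m, heading_diff_deg) tuples.
    beta_dist, beta_angle: illustrative coefficients (negative signs mean
    a larger distance or heading difference lowers the match probability).
    The paper estimates its coefficients by multilevel logistic regression;
    these values are placeholders for demonstration only.
    """
    utilities = {lid: beta_dist * d + beta_angle * a
                 for lid, d, a in candidates}
    denom = sum(math.exp(u) for u in utilities.values())
    return {lid: math.exp(u) / denom for lid, u in utilities.items()}

# Hypothetical crash with three candidate links:
# (id, distance to link in metres, heading difference in degrees)
probs = match_probabilities([("A", 5.0, 10.0),
                             ("B", 20.0, 5.0),
                             ("C", 8.0, 80.0)])
best = max(probs, key=probs.get)  # the link chosen for mapping
```

The nearby link with a small heading difference ("A") receives the highest probability, mirroring the paper's rule of mapping each crash to its most probable candidate segment.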
Predicting the safety impact of a speed limit increase using condition-based multivariate Poisson lognormal regression
Speed limit changes are considered to lead to proportional changes in the number and severity of crashes. To predict the impact of a speed limit alteration, it is necessary to define a relationship between crashes and speed on a road network. This paper examines the relationship of crashes with speed, as well as with other traffic and geometric variables, on UK motorways in order to estimate the impact of a potential speed limit increase from 70 mph to 80 mph on traffic safety. Full Bayesian multivariate Poisson lognormal regression models are applied to a dataset aggregated using the condition-based approach for crashes by vehicle (i.e. single-vehicle and multiple-vehicle) and severity (i.e. fatal or serious, and slight). The results show that single-vehicle crashes of all severities, and fatal or serious injury crashes involving multiple vehicles, increase at higher speed conditions, particularly when these are combined with lower volumes. Slight injury multiple-vehicle crashes are found not to be related to high speeds, but instead to congested traffic. Using the speed elasticity values derived from the models, the predicted annual increase in crashes after a speed limit increase on UK motorways is found to be 6.2–12.1% for fatal or serious injury crashes and 1.3–2.7% for slight injury crashes, equivalent to up to 167 additional crashes.
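The elasticity-based prediction step reduces to simple arithmetic once elasticities are estimated. A minimal sketch, assuming the common constant-elasticity form (%Δcrashes ≈ elasticity × %Δspeed); the elasticity value used below is an illustrative assumption, not one of the paper's estimates:

```python
def predicted_crash_change(elasticity, v_old=70.0, v_new=80.0):
    """Percent change in crashes implied by a constant speed elasticity.

    Uses the proportional form: %change in crashes ≈ elasticity × %change
    in speed. A 70 -> 80 mph limit change is a ~14.3% speed increase.
    """
    pct_speed_change = (v_new - v_old) / v_old * 100.0
    return elasticity * pct_speed_change

# With an assumed (hypothetical) elasticity of 0.5 for a crash category,
# the 70 -> 80 mph change implies roughly a 7.1% increase in crashes.
change = predicted_crash_change(0.5)
```

In practice the elasticities come from the fitted Poisson lognormal models, separately by vehicle involvement and severity, which is why the paper reports a range of predicted increases rather than a single figure.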
Methodological evolution and frontiers of identifying, modeling and preventing secondary crashes on highways
Secondary crashes (SCs), or crashes that occur within the boundaries of the impact area of prior, primary crashes, are one of the incident types that frequently affect highway traffic operations and safety. Existing studies have made great efforts to explore the underlying mechanisms of SCs, and relevant methodologies have been evolving over the last two decades concerning the identification, modeling, and prevention of these crashes. So far there is a lack of a detailed examination of the progress, lessons, and potential opportunities regarding existing achievements in SC-related studies. This paper provides a comprehensive investigation of the state-of-the-art approaches; examines their strengths and weaknesses; and provides guidance in exploiting new directions in SC-related research. It aims to support researchers and practitioners in understanding well-established approaches so as to further explore the frontiers. Published studies focused on SCs since 1997 have been identified, reviewed, and summarized. Key issues concentrated on the following aspects are discussed: (i) static/dynamic approaches to identify SCs; (ii) parametric/non-parametric models to analyze SC risk; and (iii) deployable countermeasures to prevent SCs. Based on the examined issues, needs, and challenges, this paper further provides insights into potential opportunities such as: (a) fusing data from multiple sources for SC identification, (b) using advanced learning algorithms for real-time SC analysis, and (c) deploying connected vehicles for SC prevention in future research. This paper contributes to the research community by providing a one-stop reference for research on secondary crashes.
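The simplest of the identification approaches the review covers, the static one, can be sketched as a fixed spatiotemporal threshold check. The thresholds and record fields below are illustrative assumptions; the studies reviewed vary these bounds or replace them with dynamic, queue-based impact areas:

```python
def is_secondary(primary, candidate, max_dist_km=2.0, max_gap_min=60.0):
    """Static spatiotemporal check for a secondary crash.

    primary, candidate: dicts with 'km' (position along the highway) and
    't' (minutes since midnight). The fixed thresholds (2 km, 60 min)
    are illustrative placeholders. A candidate counts as secondary if it
    occurs after the primary crash, within both the distance and time
    windows that approximate the primary crash's impact area.
    """
    dist = abs(candidate["km"] - primary["km"])
    gap = candidate["t"] - primary["t"]
    return 0 < gap <= max_gap_min and dist <= max_dist_km

primary = {"km": 10.0, "t": 480.0}                          # km 10, 08:00
sec = is_secondary(primary, {"km": 11.5, "t": 500.0})       # within both windows
not_sec = is_secondary(primary, {"km": 15.0, "t": 500.0})   # too far away
```

Dynamic approaches replace the fixed windows with an impact area estimated from prevailing traffic (e.g. queue length and clearance time), which is one of the methodological frontiers the paper discusses.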
Exploring crash-risk factors using Bayes’ theorem and an optimization routine
Regression models used to analyse crash counts involve some form of data aggregation (spatial, temporal, or both) that may result in inconsistent or incorrect outcomes. This paper introduces a new non-regression approach for analysing risk factors affecting crash counts without aggregating crashes. The method is an application of Bayes’ Theorem that enables comparison of the distribution of the prevailing traffic conditions on a road network (i.e. a priori) with the distribution of traffic conditions just before crashes (i.e. a posteriori). By making use of Bayes’ Theorem, the probability densities of continuous explanatory variables are estimated using kernel density estimation, and a posterior log likelihood is maximised by an optimisation routine (Maximum Likelihood Estimation). The method then estimates the parameters that define the crash risk associated with each of the examined crash contributory factors. Both simulated and real-world data were employed to demonstrate and validate the developed theory, in which two explanatory traffic variables, speed and volume, were employed. Posterior kernel densities of speed and volume at the location and time of crashes were found to differ from the prior kernel densities of the same variables. The findings are logical, as higher traffic volumes increase the risk of all crashes independently of collision type, severity and time of occurrence. Higher speeds were found to decrease the risk of multiple-vehicle crashes at peak times and not to significantly affect multiple-vehicle crash occurrences during off-peak times. However, the risk of single-vehicle crashes always increases as speed increases.
Re-visiting crash-speed relationships: a new perspective in crash modelling
Although speed is considered to be one of the main crash contributory factors, research findings are inconsistent. Independent of the robustness of their statistical approaches, crash frequency models typically employ crash data that are aggregated using spatial criteria (e.g., crash counts by link, termed a link-based approach). In this approach, the variability in crashes between links is explained by highly aggregated average measures that may be inappropriate, especially for time-varying variables such as speed and volume. This paper re-examines crash-speed relationships by creating a new crash data aggregation approach that enables improved representation of the road conditions just before crash occurrences. Crashes are aggregated according to the similarity of their pre-crash traffic and geometric conditions, forming an alternative crash count dataset termed a condition-based approach. Crash-speed relationships are separately developed and compared for both approaches by employing the annual crashes that occurred on the Strategic Road Network of England in 2012. The datasets are modelled by injury severity using multivariate Poisson lognormal regression, with multivariate spatial effects for the link-based model, using a full Bayesian inference approach. The results of the condition-based approach show that high speeds trigger crash frequency. The outcome of the link-based model is the opposite, suggesting that the speed-crash relationship is negative regardless of crash severity. The differences between the results imply that data aggregation is a crucial, yet so far overlooked, methodological element of crash data analyses that may have a direct impact on the modelling outcomes.
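The condition-based aggregation step can be sketched as binning crashes by their pre-crash traffic conditions instead of by road link. The bin widths and observations below are illustrative assumptions, not the paper's specification:

```python
from collections import Counter

def condition_based_counts(crashes, speed_bin=10, volume_bin=500):
    """Aggregate crashes by binned pre-crash traffic conditions.

    crashes: iterable of (speed_mph, volume_vph) pairs describing the
    conditions just before each crash. Each crash is assigned to the
    cell (speed bin, volume bin) it fell in; the resulting cell counts
    form the condition-based crash count dataset. Bin widths here are
    illustrative choices.
    """
    counts = Counter()
    for speed, volume in crashes:
        key = (int(speed // speed_bin) * speed_bin,
               int(volume // volume_bin) * volume_bin)
        counts[key] += 1
    return counts

# Made-up pre-crash observations: (speed in mph, volume in veh/h)
crashes = [(68, 1200), (71, 1300), (45, 2600), (47, 2900)]
counts = condition_based_counts(crashes)
# e.g. the two congested, low-speed crashes land in the same
# (40-50 mph, 2500-3000 vph) cell
```

The count models are then fitted to these condition cells rather than to links, so time-varying conditions such as speed and volume are represented at the moment of the crash instead of as link-level averages.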