Analysis of safety impact of freeway designs using data mining techniques


The Highway Safety Manual (HSM) was developed by the Transportation Research Board (TRB) and published in 2010 to introduce a science-based technical approach to safety analysis. The HSM provides analytical methods for quantifying the safety effects of decisions and treatments in planning, design, operation, and maintenance. One of its main parts, Part D, contains crash modification factors (CMFs) for various treatments on roadway segments and at intersections. In general, a CMF represents the overall safety performance of a specific treatment. Although the first edition of the HSM provides various CMFs for rural roadways and urban arterials, it lacks CMFs for freeways. According to the HSM, a CMF can be estimated using observational before-after studies or the cross-sectional method. Observational before-after studies are well-known approaches for evaluating safety effectiveness and calculating the CMFs of specific roadway treatments. The cross-sectional method has also been commonly applied to derive CMFs because the required data are easier to obtain than for before-after approaches. Estimating a CMF with the cross-sectional method requires developing safety performance functions (SPFs) or crash prediction models (CPMs). Because it can account for over-dispersion, generalized linear regression with a negative binomial (NB) distribution has been widely used to develop SPFs. In the cross-sectional method, the CMF is calculated from the SPF coefficient of the variable associated with the specific treatment. However, CMFs estimated from a generalized linear model (GLM) cannot account for nonlinear effects of the treatment, since the coefficients in the GLM are assumed to be fixed. Therefore, the objective of this study is to evaluate the safety effects of multiple roadway cross-section design elements on freeways by developing CMFs using data mining approaches.
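The cross-sectional calculation described above can be illustrated with a minimal sketch. It assumes a log-link SPF, in which case a CMF for changing a design variable from a base value to a new value is commonly taken as the exponential of the coefficient times the change. The function name, the shoulder-width variable, and the coefficient value are hypothetical and are not results from this study.

```python
import math

def cmf_from_coefficient(beta: float, x: float, x_base: float) -> float:
    """CMF for moving a design variable from x_base to x under a log-link SPF.

    Assumes the SPF has the form log(expected crashes) = ... + beta * x,
    so the ratio of expected crashes is exp(beta * (x - x_base)).
    """
    return math.exp(beta * (x - x_base))

# Hypothetical coefficient, for illustration only (not estimated in this study).
beta_shoulder = -0.05  # change in log(expected crashes) per foot of shoulder width

# Widening the shoulder from 8 ft to 10 ft.
print(round(cmf_from_coefficient(beta_shoulder, x=10.0, x_base=8.0), 3))  # 0.905

# With a fixed coefficient, each additional foot yields the same CMF
# regardless of the starting width.
print(cmf_from_coefficient(beta_shoulder, 9.0, 8.0)
      == cmf_from_coefficient(beta_shoulder, 10.0, 9.0))  # True
```

The second check shows the limitation noted above: a single fixed coefficient implies the same proportional effect per unit change everywhere in the variable's range.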
To account for both nonlinear effects and interactions between variables, two promising data mining techniques, multivariate adaptive regression splines (MARS) and generalized additive models (GAM), were applied in this study.
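As a rough illustration of how a spline-based model can capture a nonlinear treatment effect, the sketch below uses a single MARS-style hinge basis function, max(0, x - knot), in the log-scale predictor. The knot location, the slopes, and the shoulder-width variable are made-up values chosen only to show the mechanism; they are not fitted results, and a real MARS fit would select knots and terms from the data.

```python
import math

def hinge(x: float, knot: float) -> float:
    """MARS-style hinge basis function: zero below the knot, linear above it."""
    return max(0.0, x - knot)

def log_crash_term(width: float) -> float:
    """Hypothetical contribution of shoulder width (ft) to log(expected crashes).

    Slope is -0.08 per ft below the assumed 8 ft knot and
    -0.08 + 0.06 = -0.02 per ft above it (all values are illustrative).
    """
    return -0.08 * width + 0.06 * hinge(width, 8.0)

def cmf(width: float, base_width: float) -> float:
    """CMF for changing shoulder width from base_width to width."""
    return math.exp(log_crash_term(width) - log_crash_term(base_width))

# One extra foot below the knot versus one extra foot above it.
print(round(cmf(7.0, 6.0), 3))
print(round(cmf(11.0, 10.0), 3))
```

Under these assumed values, an added foot of width reduces expected crashes more below the knot than above it, which is the kind of varying effect a single fixed GLM coefficient cannot express.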
