7 research outputs found

    Anomaly detection and classification in traffic flow data from fluctuations in the flow-density relationship

    We describe and validate a novel data-driven approach to the real-time detection and classification of traffic anomalies based on the identification of atypical fluctuations in the relationship between density and flow. For aggregated data under stationary conditions, flow and density are related by the fundamental diagram. However, high-resolution data obtained from modern sensor networks is generally non-stationary and disaggregated. Such data consequently shows significant statistical fluctuations. These fluctuations are best described using a bivariate probability distribution in the density-flow plane. By applying kernel density estimation to high-volume data from the UK National Traffic Information Service (NTIS), we empirically construct these distributions for London's M25 motorway. Curves in the density-flow plane are then constructed, analogous to quantiles of univariate distributions. These curves quantitatively separate atypical fluctuations from typical traffic states. Although the algorithm identifies anomalies in general rather than specific events, we find that fluctuations outside the 95% probability curve correlate strongly with the spikes in travel time associated with significant congestion events. Moreover, the size of an excursion from the typical region provides a simple, real-time measure of the severity of detected anomalies. We validate the algorithm by benchmarking its ability to identify labelled events in historical NTIS data against some commonly used methods from the literature. Detection rate, time-to-detect and false alarm rate are used as metrics and found to be generally comparable except in situations when the speed distribution is bi-modal. In such situations, the new algorithm achieves a much lower false alarm rate without suffering significant degradation on the other metrics. This method has the additional advantage of being self-calibrating. Comment: 23 pages, 12 figures.
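
    The idea of a probability curve in the density-flow plane can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic sample, the hand-picked bandwidths `hx` and `hy`, and the helper names `make_kde`, `level_for_coverage` and `is_atypical` are all assumptions made for illustration. The sketch fits a product-Gaussian kernel density estimate and flags a point as atypical when its estimated density falls below the level that roughly 95% of the sample exceeds.

```python
import math
import random


def make_kde(points, hx, hy):
    """Return a product-Gaussian kernel density estimate over (density, flow) pairs."""
    norm = 1.0 / (len(points) * 2.0 * math.pi * hx * hy)

    def pdf(x, y):
        total = 0.0
        for px, py in points:
            zx = (x - px) / hx
            zy = (y - py) / hy
            total += math.exp(-0.5 * (zx * zx + zy * zy))
        return norm * total

    return pdf


def level_for_coverage(points, pdf, coverage=0.95):
    """Density level that roughly `coverage` of the sample points exceed."""
    values = sorted(pdf(x, y) for x, y in points)
    cut = int((1.0 - coverage) * len(values))
    return values[cut]


# Synthetic observations (hypothetical numbers, not NTIS data).
rng = random.Random(0)
sample = []
for _ in range(300):
    k = rng.gauss(30.0, 5.0)              # density, veh/km
    q = 60.0 * k + rng.gauss(0.0, 100.0)  # flow, veh/h
    sample.append((k, q))

pdf = make_kde(sample, hx=2.0, hy=80.0)   # bandwidths chosen by hand (assumption)
level = level_for_coverage(sample, pdf)


def is_atypical(k, q):
    """True when (k, q) lies outside the estimated 95% region."""
    return pdf(k, q) < level
```

    In this sketch a new observation is tested with `is_atypical(k, q)`. The gap between `level` and `pdf(k, q)` could serve as a crude severity score, although the abstract does not say how the authors measure the size of an excursion.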

    Data science perspectives on problems in intelligent transportation systems and mobility

    This thesis is a body of work applying data science and mathematical modelling to problems in intelligent transportation systems. Utilising data collected from the M25 London orbital, four problems relevant to both industry and academia are considered. In chapter 4 we develop a novel methodology for anomaly detection on road networks. We determine a data-driven region of typical behaviour in the flow-density plane, tracking fluctuations from this to identify anomalies in real time. We find this offers generally comparable performance to existing methods, but is clearly superior when the distribution of speeds conditioned on time of week is bimodal. In chapter 5, we quantify the prevalence of primary and secondary traffic incidents in our data using a novel self-exciting point process. The self-excitation component suggests 6-7% of incidents are most likely secondary, occurring temporally and spatially in the wake of other incidents. Our modelling further identifies two spatial hotspots and captures commuting patterns in the UK. We are able to apply out-of-sample validation and show the model is statistically defensible. Chapter 6 explores dynamic prediction of incident durations. We find non-parametric neural network models offer strong performance compared to a range of alternative candidates, achieving errors below current industry targets. By exploring feature importance, we find time series prove informative for predictions on short horizons whereas time of day and location do so at longer horizons. We explore an emergent behaviour path planning model in the context of autonomous vehicles in chapter 7. This was developed in conjunction with engineers from Jaguar Land Rover and incorporates practical constraints real-world vehicles must satisfy. Formulating an optimization problem incorporating comfort, safety and progress, we show dynamically solving this results in emergent complex driving behaviours: vehicle following, passing and overtaking. Safety is based on a distributional prediction of drivers' behaviours, with its variance indirectly defining properties of the emergent behaviours. Our findings throughout this work offer models and methodologies that can be used to improve the management and better understand the behaviour of existing transportation infrastructure, as well as the development of future technologies.

    Dynamic and Interpretable Hazard-Based Models of Traffic Incident Durations

    Understanding and predicting the duration or “return-to-normal” time of traffic incidents is important for system-level management and optimization of road transportation networks. Increasing real-time availability of multiple data sources characterizing the state of urban traffic networks, together with advances in machine learning, offers the opportunity for new and improved approaches to this problem that go beyond static statistical analyses of incident duration. In this paper we consider two such improvements: dynamic update of incident duration predictions as new information about incidents becomes available and automated interpretation of the factors responsible for these predictions. For our use case, we take one year of incident data and traffic state time-series data from the M25 motorway in London. We use it to train models that predict the probability distribution of incident durations, utilizing both time-invariant and time-varying features of the data. The latter allow predictions to be updated as an incident progresses and more information becomes available. For dynamic predictions, time-series features are fed into the Match-Net algorithm, a temporal convolutional hitting-time network, recently developed for dynamical survival analysis in clinical applications. The predictions are benchmarked against static regression models for survival analysis and against an established dynamic technique known as landmarking and found to perform favourably by several standard comparison measures. To provide interpretability, we utilize the concept of Shapley values recently developed in the domain of interpretable artificial intelligence to rank the features most relevant to the model predictions at different time horizons. For example, the time of day is always a significantly influential time-invariant feature, whereas the time-series features strongly influence predictions at 5 and 60-min horizons. Although we focus here on traffic incidents, the methodology we describe can be applied to many survival analysis problems where time-series data is to be combined with time-invariant features.
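
    The Shapley-value ranking mentioned in this abstract can be illustrated on a toy example. This is a sketch under stated assumptions, not the paper's pipeline: `toy_value` is a made-up stand-in for a trained model, and the feature names `time_of_day`, `speed_series` and `location` are hypothetical. Exact Shapley values are computed by averaging each feature's marginal contribution over every ordering, which is only feasible for a handful of features.

```python
from itertools import permutations
from math import factorial


def shapley_values(value, features):
    """Exact Shapley values: average marginal contribution over all feature orderings."""
    totals = {f: 0.0 for f in features}
    for order in permutations(features):
        coalition = frozenset()
        for f in order:
            extended = coalition | {f}
            totals[f] += value(extended) - value(coalition)
            coalition = extended
    n_orders = factorial(len(features))
    return {f: total / n_orders for f, total in totals.items()}


def toy_value(coalition):
    """Hypothetical predicted duration (minutes) given the set of known features."""
    score = 30.0  # baseline prediction with no features known
    if "time_of_day" in coalition:
        score += 10.0
    if "speed_series" in coalition:
        score += 20.0
    if "location" in coalition:
        score += 2.0
    if "time_of_day" in coalition and "speed_series" in coalition:
        score += 6.0  # interaction term, split equally by the Shapley rule
    return score


features = ["time_of_day", "speed_series", "location"]
phi = shapley_values(toy_value, features)
ranking = sorted(features, key=lambda f: phi[f], reverse=True)
```

    The values in `phi` sum to the difference between the prediction with all features and the baseline, which is the efficiency property that makes Shapley values usable as an attribution of a single prediction.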

    A non-parametric Hawkes process model of primary and secondary accidents on a UK smart motorway

    A self-exciting spatio-temporal point process is fitted to incident data from the UK National Traffic Information Service to model the rates of primary and secondary accidents on the M25 motorway in a 12-month period during 2017-18. This process uses a background component to represent primary accidents, and a self-exciting component to represent secondary accidents. The background consists of periodic daily and weekly components, a spatial component and a long-term trend. The self-exciting components are decaying, unidirectional functions of space and time. These components are determined via kernel smoothing and likelihood estimation. Temporally, the background is stable across seasons with a daily double peak structure reflecting commuting patterns. Spatially, there are two peaks in intensity, one of which becomes more pronounced during the study period. Self-excitation accounts for 6-7% of the data with associated time and length scales around 100 minutes and 1 kilometre respectively. In-sample and out-of-sample validation are performed to assess the model fit. When we restrict the data to incidents that resulted in large speed drops on the network, the results remain coherent.
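
    The split into background and self-excited events can be sketched with a much simpler model than the one described here. The sketch below is temporal only and swaps in a parametric exponential kernel, whereas the abstract describes spatio-temporal components determined by kernel smoothing. The branching ratio `alpha = 0.065` and decay rate `beta = 1/100` per minute echo the 6-7% share and 100-minute scale quoted above. The background rate `mu`, the horizon and all function names are assumptions.

```python
import math
import random


def intensity(t, past_times, mu, alpha, beta):
    """Conditional intensity: background mu plus exponentially decaying excitation."""
    boost = sum(alpha * beta * math.exp(-beta * (t - s)) for s in past_times if s < t)
    return mu + boost


def poisson_draw(mean, rng):
    """Knuth's method for a Poisson variate; adequate for the small means used here."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate(mu, alpha, beta, horizon, rng):
    """Simulate via the cluster representation; returns sorted (time, is_secondary) pairs."""
    pending = []
    t = rng.expovariate(mu)
    while t < horizon:                      # primary events: homogeneous Poisson process
        pending.append((t, False))
        t += rng.expovariate(mu)
    events = []
    while pending:
        time, secondary = pending.pop()
        events.append((time, secondary))
        for _ in range(poisson_draw(alpha, rng)):   # each event triggers Poisson(alpha) offspring
            child = time + rng.expovariate(beta)    # delay with mean 1/beta minutes
            if child < horizon:
                pending.append((child, True))
    events.sort()
    return events


rng = random.Random(1)
events = simulate(mu=0.05, alpha=0.065, beta=0.01, horizon=200_000.0, rng=rng)
secondary_share = sum(1 for _, secondary in events if secondary) / len(events)
```

    In the cluster representation the long-run share of triggered events equals the branching ratio, so `secondary_share` should land near `alpha`. That is the sense in which a fitted self-excitation weight can be read as the proportion of secondary incidents.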

    Anomaly detection and classification in traffic flow data from fluctuations in the flow–density relationship

    We describe and validate a novel data-driven approach to the real-time detection and classification of traffic anomalies based on the identification of atypical fluctuations in the relationship between density and flow. For aggregated data under stationary conditions, flow and density are related by the fundamental diagram. However, high-resolution data obtained from modern sensor networks is generally non-stationary and disaggregated. Such data consequently shows significant statistical fluctuations. These fluctuations are best described using a bivariate probability distribution in the density–flow plane. By applying kernel density estimation to high-volume data from the UK National Traffic Information Service (NTIS), we empirically construct these distributions for London’s M25 motorway. Curves in the density–flow plane are then constructed, analogous to quantiles of univariate distributions. These curves quantitatively separate atypical fluctuations from typical traffic states. Although the algorithm identifies anomalies in general rather than specific events, we find that fluctuations outside the 95% probability curve correlate strongly with the spikes in travel time associated with significant congestion events. Moreover, the size of an excursion from the typical region provides a simple, real-time measure of the severity of detected anomalies. We validate the algorithm by benchmarking its ability to identify labelled events in historical NTIS data against some commonly used methods from the literature. Detection rate, time-to-detect and false alarm rate are used as metrics and found to be generally comparable except in situations when the speed distribution is bi-modal. In such situations, the new algorithm achieves a much lower false alarm rate without suffering significant degradation on the other metrics. This method has the additional advantages of being self-calibrating and adaptive: the curve marking atypical behaviour is different for each section of road and can evolve in time as the data changes, for example, due to long-term roadworks.