Improving adaptation and interpretability of a short-term traffic forecasting system
Traffic management is becoming more important than ever, especially in overcrowded big cities with severe pollution problems and unprecedented changes in mobility. In this scenario, road-traffic prediction plays a key role within Intelligent Transportation Systems, allowing traffic managers to anticipate events and take the proper decisions. This paper analyses a commercial real-time prediction system, together with its current problems and limitations. The analysis unveils the trade-off between simple parsimonious models and more complex models. Finally, we propose an enriched machine learning framework, Adarules, for real-time traffic prediction, treating the problem as continuously incoming data streams with all the problems that commonly occur in such a volatile scenario, namely changes in the network infrastructure and demand, and new or failing detection stations, among others. The framework is also able to automatically infer the features most relevant to our end task, including the relationships within the road network. Although the proposed framework is intended to evolve and grow with new incoming big data, it can be used without any prior knowledge, as it can start learning the structure and parameters automatically from data. We test this predictive system in different real-world scenarios and evaluate its performance when integrating a multi-task learning paradigm for the traffic prediction task.
Peer Reviewed. Postprint (published version).
HYPA: Efficient Detection of Path Anomalies in Time Series Data on Networks
The unsupervised detection of anomalies in time series data has important
applications in user behavioral modeling, fraud detection, and cybersecurity.
Anomaly detection has, in fact, been extensively studied in categorical
sequences. However, we often have access to time series data that represent
paths through networks. Examples include transaction sequences in financial
networks, click streams of users in networks of cross-referenced documents, or
travel itineraries in transportation networks. To reliably detect anomalies, we
must account for the fact that such data contain a large number of independent
observations of paths constrained by a graph topology. Moreover, the
heterogeneity of real systems rules out frequency-based anomaly detection
techniques, which do not account for highly skewed edge and degree statistics.
To address this problem, we introduce HYPA, a novel framework for the
unsupervised detection of anomalies in large corpora of variable-length
temporal paths in a graph. HYPA provides an efficient analytical method to
detect paths with anomalous frequencies that result from nodes being traversed
in unexpected chronological order.
Comment: 11 pages with 8 figures and supplementary material. To appear at SIAM
Data Mining (SDM 2020).
A taxonomy framework for unsupervised outlier detection techniques for multi-type data sets
The term "outlier" can generally be defined as an observation that is significantly different from
the other values in a data set. The outliers may be instances of error or indicate events. The
task of outlier detection aims at identifying such outliers in order to improve the analysis of
data and further discover interesting and useful knowledge about unusual events within numerous
application domains. In this paper, we report on contemporary unsupervised outlier detection
techniques for multiple types of data sets and provide a comprehensive taxonomy framework and
two decision trees to select the most suitable technique based on the data set. Furthermore, we
highlight the advantages, disadvantages and performance issues of each class of outlier detection
techniques under this taxonomy framework.