Multi-Output Gaussian Processes for Crowdsourced Traffic Data Imputation
Traffic speed data imputation is a fundamental challenge for data-driven
transport analysis. In recent years, with the ubiquity of GPS-enabled devices
and the widespread use of crowdsourcing alternatives for the collection of
traffic data, transportation professionals increasingly look to such
user-generated data for many analysis, planning, and decision support
applications. However, due to the mechanics of the data collection process,
crowdsourced traffic data such as probe-vehicle data is highly prone to missing
observations, making accurate imputation crucial for the success of any
application that makes use of that type of data. In this article, we propose
the use of multi-output Gaussian processes (GPs) to model the complex spatial
and temporal patterns in crowdsourced traffic data. While the Bayesian
nonparametric formalism of GPs allows us to model observation uncertainty, the
multi-output extension based on convolution processes effectively enables us to
capture complex spatial dependencies between nearby road segments. Using 6
months of crowdsourced traffic speed data or "probe vehicle data" for several
locations in Copenhagen, the proposed approach is empirically shown to
significantly outperform popular state-of-the-art imputation methods.
Comment: 10 pages, IEEE Transactions on Intelligent Transportation Systems,
201
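To make the idea of GP-based imputation concrete, here is a minimal sketch. It is not the multi-output convolution-process model from the abstract above: it fits a plain single-output GP with an RBF kernel to one speed series and fills the gaps with the posterior mean. The function names, kernel choice, and hyperparameter values are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between two 1-D arrays of time stamps."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_impute(t, y, length_scale=1.0, variance=1.0, noise=1e-2):
    """Fill NaN entries of y using the posterior mean of a single-output GP.

    Hypothetical helper: hyperparameters are fixed by hand here, whereas a
    real model would learn them from the data.
    Returns (filled series, posterior mean, posterior std) at every t.
    """
    obs = ~np.isnan(y)
    t_obs, y_obs = t[obs], y[obs]
    mu = y_obs.mean()  # constant prior mean estimated from observed data

    K = rbf_kernel(t_obs, t_obs, length_scale, variance)
    K += noise * np.eye(t_obs.size)  # observation noise on the diagonal
    K_s = rbf_kernel(t, t_obs, length_scale, variance)

    # Cholesky-based solve for numerical stability
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs - mu))
    mean = mu + K_s @ alpha

    v = np.linalg.solve(L, K_s.T)
    var = variance - np.sum(v ** 2, axis=0)
    std = np.sqrt(np.maximum(var, 0.0))  # clip tiny negative round-off

    # Keep observed values untouched; only missing entries are replaced
    filled = np.where(obs, y, mean)
    return filled, mean, std
```

The posterior standard deviation returned alongside the mean is what allows a GP to express uncertainty about each imputed value. Capturing dependencies between neighboring road segments would require a multi-output kernel, which this sketch omits.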
Matrix Completion With Variational Graph Autoencoders: Application in Hyperlocal Air Quality Inference
Inferring air quality from a limited number of observations is an essential
task for monitoring and controlling air pollution. Existing inference methods
typically use low spatial resolution data collected by fixed monitoring
stations and infer the concentration of air pollutants using additional types
of data, e.g., meteorological and traffic information. In this work, we focus
on street-level air quality inference by utilizing data collected by mobile
stations. We formulate air quality inference in this setting as a graph-based
matrix completion problem and propose a novel variational model based on graph
convolutional autoencoders. Our model effectively captures the spatio-temporal
correlation of the measurements and does not depend on the availability of
additional information apart from the street-network topology. Experiments on a
real air quality dataset, collected with mobile stations, show that the
proposed model outperforms state-of-the-art approaches.
Arriving on time: estimating travel time distributions on large-scale road networks
Most optimal routing problems focus on minimizing travel time or distance
traveled. Oftentimes, a more useful objective is to maximize the probability of
on-time arrival, which requires statistical distributions of travel times,
rather than just mean values. We propose a method to estimate travel time
distributions on large-scale road networks, using probe vehicle data collected
from GPS. We present a framework that handles large volumes of input data and
scales linearly with the size of the network. Leveraging the planar topology of
the graph, the method efficiently computes the time correlations between
neighboring streets. First, raw probe vehicle traces are compressed into pairs
of travel times and number of stops for each traversed road segment using a
"stop-and-go" algorithm developed for this work. The compressed data is then
used as input for training a path travel time model, which couples a Markov
model with a Gaussian Markov random field. Finally, scalable inference
algorithms are developed for obtaining path travel time distributions from the
composite MM-GMRF model. We illustrate the accuracy and scalability of our
model on a 505,000 road link network spanning the San Francisco Bay Area.
Targeted matrix completion
Matrix completion is a problem that arises in many data-analysis settings
where the input consists of a partially-observed matrix (e.g., recommender
systems, traffic matrix analysis, etc.). Classical approaches to matrix
completion assume that the input partially-observed matrix is low rank. The
success of these methods depends on the number of observed entries and the rank
of the matrix; the larger the rank, the more entries need to be observed in
order to accurately complete the matrix. In this paper, we deal with matrices
that are not necessarily low rank themselves, but rather contain low-rank
submatrices. We propose Targeted, which is a general framework for completing
such matrices. In this framework, we first extract the low-rank submatrices and
then apply a matrix-completion algorithm to these low-rank submatrices as well
as the remainder matrix separately. Although for the completion itself we use
state-of-the-art completion methods, our results demonstrate that Targeted
achieves significantly smaller reconstruction errors than other classical
matrix-completion methods. One of the key technical contributions of the paper
lies in the identification of the low-rank submatrices from the input
partially-observed matrices.
Comment: Proceedings of the 2017 SIAM International Conference on Data Mining
(SDM
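The classical low-rank assumption mentioned in the abstract above can be illustrated with a minimal sketch. This is not the Targeted framework, and it does not extract low-rank submatrices. It is a simple iterative truncated-SVD baseline ("hard impute") that assumes the whole matrix is low rank and that the target rank is known. The function name and parameters are hypothetical.

```python
import numpy as np

def complete_low_rank(M, mask, rank, n_iters=500, tol=1e-8):
    """Complete a partially observed matrix by iterative truncated SVD.

    M    : 2-D array; entries where mask is False are ignored (may be NaN).
    mask : boolean array, True where M is observed.
    rank : assumed rank of the underlying matrix (an assumption of this sketch).
    """
    # Start from the observed entries with zeros in the unobserved positions
    X = np.where(mask, M, 0.0)
    for _ in range(n_iters):
        # Project the current estimate onto the set of rank-`rank` matrices
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        low = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Re-impose the observed entries; keep the projection elsewhere
        X_new = np.where(mask, M, low)
        change = np.linalg.norm(X_new - X)
        X = X_new
        if change <= tol * max(np.linalg.norm(X), 1.0):
            break
    return X
```

As the abstract notes, the larger the rank, the more observed entries a method of this kind needs. When the matrix as a whole is not low rank, a single global projection like this one is a poor fit, which is the setting the abstract addresses.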