27 research outputs found
Heteroscedastic Gaussian processes for uncertainty modeling in large-scale crowdsourced traffic data
Accurately modeling traffic speeds is a fundamental part of efficient
intelligent transportation systems. Nowadays, with the widespread deployment of
GPS-enabled devices, it has become possible to crowdsource the collection of
speed information to road users (e.g. through mobile applications or dedicated
in-vehicle devices). Despite its rather wide spatial coverage, crowdsourced
speed data also brings very important challenges, such as the highly variable
measurement noise in the data due to a variety of driving behaviors and sample
sizes. When not properly accounted for, this noise can severely compromise any
application that relies on accurate traffic data. In this article, we propose
the use of heteroscedastic Gaussian processes (HGP) to model the time-varying
uncertainty in large-scale crowdsourced traffic data. Furthermore, we develop an
HGP conditioned on sample size and traffic regime (SRC-HGP), which makes use of
sample size information (probe vehicles per minute) as well as previously
observed speeds, in order to more accurately model the uncertainty in observed
speeds. Using 6 months of crowdsourced traffic data from Copenhagen, we
empirically show that the proposed heteroscedastic models produce significantly
better predictive distributions when compared to current state-of-the-art
methods for both speed imputation and short-term forecasting tasks. Comment: 22 pages, Transportation Research Part C: Emerging Technologies
(Elsevier)
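As a rough illustration of why sample size matters for the noise model, the sketch below scores observed speeds under a Gaussian likelihood whose variance shrinks with the number of probe vehicles. The rule `base_var / n_probes` and all function names are assumptions made for this example; the SRC-HGP in the abstract learns its noise model rather than fixing it this way.

```python
import math


def gaussian_logpdf(y, mean, var):
    """Log density of a Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (y - mean) ** 2 / var)


def noise_variance(n_probes, base_var):
    """Assumed rule: averaging n independent probe speeds divides the variance by n."""
    return base_var / max(n_probes, 1)


def heteroscedastic_loglik(speeds, means, n_probes, base_var):
    """Sum of per-interval log densities, each with its own noise variance."""
    return sum(
        gaussian_logpdf(y, m, noise_variance(n, base_var))
        for y, m, n in zip(speeds, means, n_probes)
    )
```

Under this assumption, a 10 km/h deviation from the predicted mean is penalised far less when it comes from a single probe vehicle than when it comes from an interval with many probes.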
Estimating Latent Demand of Shared Mobility through Censored Gaussian Processes
Transport demand is highly dependent on supply, especially for shared
transport services where availability is often limited. As observed demand
cannot be higher than available supply, historical transport data typically
represents a biased, or censored, version of the true underlying demand
pattern. Without explicitly accounting for this inherent distinction,
predictive models of demand would necessarily represent a biased version of
true demand, thus less effectively predicting the needs of service users. To
counter this problem, we propose a general method for censorship-aware demand
modeling, for which we devise a censored likelihood function. We apply this
method to the task of shared mobility demand prediction by incorporating the
censored likelihood within a Gaussian Process model, which can flexibly
approximate arbitrary functional forms. Experiments on artificial and
real-world datasets show how taking into account the limiting effect of supply
on demand is essential in the process of obtaining an unbiased predictive model
of user demand behavior. Comment: 21 pages, 10 figures
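A censored likelihood of the kind described above can be sketched as follows, assuming Gaussian observation noise and right-censoring at the available supply (a Tobit-style form). The function names and the fixed `sigma` are assumptions for this example, not the paper's exact formulation.

```python
import math


def normal_logpdf(y, mean, sigma):
    """Log density of N(mean, sigma^2) at y."""
    z = (y - mean) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def normal_sf(z):
    """Survival function 1 - Phi(z) of the standard normal."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def censored_loglik(observed, censored, latent_mean, sigma):
    """Log-likelihood where censored points only say 'true demand >= observed'."""
    total = 0.0
    for y, is_censored, f in zip(observed, censored, latent_mean):
        if is_censored:
            # Supply ran out: all probability mass above y is consistent with the data.
            total += math.log(normal_sf((y - f) / sigma))
        else:
            # Supply was sufficient: the observation is the demand plus noise.
            total += normal_logpdf(y, f, sigma)
    return total
```

With this form, a latent demand well above a censored observation is not penalised, whereas a standard Gaussian likelihood would pull the estimate down toward the supply limit.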
Online Predictive Optimization Framework for Stochastic Demand-Responsive Transit Services
This study develops an online predictive optimization framework for
dynamically operating a transit service in an area of crowd movements. The
proposed framework integrates demand prediction and supply optimization to
periodically redesign the service routes based on recently observed demand. To
predict demand for the service, we use Quantile Regression to estimate the
marginal distribution of movement counts between each pair of serviced
locations. The framework then combines these marginals into a joint demand
distribution by constructing a Gaussian copula, which captures the structure of
correlation between the marginals. For supply optimization, we devise a linear
programming model, which simultaneously determines the route structure and the
service frequency according to the predicted demand. Importantly, our framework
both preserves the uncertainty structure of future demand and leverages this
for robust route optimization, while keeping both components decoupled. We
evaluate our framework using a real-world case study of autonomous mobility in
a university campus in Denmark. The results show that our framework often
obtains the ground truth optimal solution, and can outperform conventional
methods for route optimization, which do not leverage full predictive
distributions. Comment: 34 pages, 12 figures, 5 tables
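The step of combining per-pair marginals into a joint distribution can be sketched for two location pairs as below. The predicted quantiles are assumed to come from some quantile regression model, the inverse CDF is approximated by linear interpolation between them, and the correlation `rho` is taken as given; all names and values are hypothetical.

```python
import math
import random


def normal_cdf(z):
    """Standard normal CDF."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def quantile_lookup(u, levels, values):
    """Piecewise-linear inverse CDF through the predicted quantiles."""
    if u <= levels[0]:
        return values[0]
    if u >= levels[-1]:
        return values[-1]
    for k in range(1, len(levels)):
        if u <= levels[k]:
            w = (u - levels[k - 1]) / (levels[k] - levels[k - 1])
            return values[k - 1] + w * (values[k] - values[k - 1])


def sample_joint_demand(rho, levels, quantiles_a, quantiles_b, n, seed=0):
    """Draw n joint demand samples for two location pairs via a Gaussian copula."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        # Correlated standard normals with correlation rho.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        # Map to uniforms, then through each marginal's inverse CDF.
        u1, u2 = normal_cdf(z1), normal_cdf(z2)
        samples.append(
            (quantile_lookup(u1, levels, quantiles_a),
             quantile_lookup(u2, levels, quantiles_b))
        )
    return samples
```

The copula only carries the dependence structure, so the marginals can be re-estimated without touching the correlation model, which is one way to read the decoupling the abstract mentions.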
Variational Optimisation for Non-conjugate Likelihood Gaussian Process Models
In this thesis we address the problems associated with non-conjugate likelihood Gaussian process models, i.e., probabilistic models where the likelihood function and the Gaussian process priors are non-conjugate. Such problems include intractability, scalability, and poor local optima for the parameters and hyper-parameters of the models. In particular, we address these issues in the context of probabilistic models where the likelihood's parameters are modelled as latent parameter functions drawn from correlated Gaussian processes. We study three ways to generate such latent parameter functions: 1. from a linear model of coregionalisation; 2. from convolution processes, i.e., a convolution integral between smoothing kernels and Gaussian process priors; and 3. using variational inducing kernels, an alternative way to generate the latent parameter functions within the convolution processes formalism by using a double convolution integral. We borrow ideas from different variational optimisation mechanisms, which consist of introducing a variational (or exploratory) distribution over the model so as to build objective functions that let us deal with intractability and that scale to massive numbers of observations. These variational optimisation mechanisms also allow us to infer the model hyper-parameters together with the posterior's parameters through a fully natural gradient optimisation scheme, which is useful for tackling the problem of poor local optima. Such mechanisms have been broadly studied in the context of reinforcement learning and Bayesian deep learning, where they have proven to be successful exploratory-learning tools; nonetheless, they have not been much studied in the context of Gaussian process models, so we provide a study of their performance in that context.
A Data Fusion CANDECOMP-PARAFAC Method for Interval-wise Missing Network Volume Imputation
Imputing missing traffic data is a fundamental requirement and a crucial application for real-world intelligent transportation systems. Across a wide range of missing patterns, imputation methods based on tensor learning have demonstrated their superiority by effectively characterizing complex spatiotemporal correlations. However, interval-wise missing volume scenarios remain challenging, in particular for long-term continuous missing data and high-dimensional data with complex missing mechanisms and patterns. In this paper, we propose a customized tensor decomposition framework, named the data fusion CANDECOMP/PARAFAC (DFCP) tensor decomposition, which combines vehicle license plate recognition (LPR) data and cellphone location (CL) data for interval-wise missing volume imputation on urban networks. Because CL data has wide spatiotemporal coverage and correlates highly with real-world traffic states, it is fused into the LPR data imputation. The two data sources are treated as a data-type dimension and combined with the other dimensions (segments, time, days) to design a 4-way low-n-rank tensor decomposition for data reconstruction. Furthermore, to deal with the diverse disturbances in different data dimensions, we derive a regularization penalty coefficient for data imputation. Unlike existing regularization schemes, we further introduce Bayesian optimization (BO) to improve performance given the non-convexity of the objective function in our regularized hyperparameter solutions during tensor decomposition. Numerical experiments show that the proposed method, combining CL and LPR data, significantly outperforms imputation using LPR data only. A sensitivity analysis with varying missing lengths and rates demonstrates the robustness of the model's performance.
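To make the 4-way reconstruction concrete, the sketch below rebuilds tensor entries from CANDECOMP/PARAFAC factor matrices and uses them to fill missing volumes. It assumes the factors have already been fitted; the fitting step, the regularization, and the Bayesian optimization from the abstract are omitted, and the function names and mode order (data type, segment, time, day) are assumptions for this example.

```python
def cp_entry(factors, index):
    """Value of a CP-model tensor entry: sum over rank of products of factor rows."""
    rank = len(factors[0][0])
    total = 0.0
    for r in range(rank):
        term = 1.0
        for mode, i in enumerate(index):
            term *= factors[mode][i][r]
        total += term
    return total


def impute(observed, factors):
    """Keep observed volumes and fill missing ones (None) from the CP reconstruction."""
    return {
        idx: (cp_entry(factors, idx) if value is None else value)
        for idx, value in observed.items()
    }
```

Here `observed` maps an index tuple such as `(data_type, segment, time, day)` to a volume or `None`, and `factors` holds one matrix per mode with one row per index and one column per rank component.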
Learning Behavior Models for Interpreting and Predicting Traffic Situations
In this thesis, we present Bayesian state estimation and machine learning methods for predicting traffic situations. The cognitive ability to assess situations and behaviors of traffic participants, and to anticipate possible developments, is an essential requirement for several applications in the traffic domain, especially for self-driving cars. We present a method for learning behavior models from unlabeled traffic observations and develop improved learning methods for decision trees.
A supervised learning framework in the context of multiple annotators
The increasing popularity of crowdsourcing platforms, e.g., Amazon Mechanical Turk, is changing how datasets for supervised learning are built. In these cases, instead of having datasets labeled by one source (assumed to be an expert who provides the absolute gold standard), we have datasets labeled by multiple annotators with different and unknown expertise. Hence, we face a multi-labeler scenario, which typical supervised learning models cannot tackle. For this reason, much attention has recently been given to approaches that capture the wisdom of multiple annotators. However, such methods rely on two key assumptions: that the labeler's performance does not depend on the input space, and that the annotators are independent of one another. Both are hardly feasible in real-world settings.