5,033 research outputs found
Short-Term Forecasting of Passenger Demand under On-Demand Ride Services: A Spatio-Temporal Deep Learning Approach
Short-term passenger demand forecasting is of great importance to the
on-demand ride service platform, which can incentivize vacant cars moving from
over-supply regions to over-demand regions. The spatial dependences, temporal
dependences, and exogenous dependences need to be considered simultaneously,
however, which makes short-term passenger demand forecasting challenging. We
propose a novel deep learning (DL) approach, named the fusion convolutional
long short-term memory network (FCL-Net), to address these three dependences
within one end-to-end learning architecture. The model is stacked and fused by
multiple convolutional long short-term memory (LSTM) layers, standard LSTM
layers, and convolutional layers. The fusion of convolutional techniques and
the LSTM network enables the proposed DL approach to better capture the
spatio-temporal characteristics and correlations of explanatory variables. A
tailored spatially aggregated random forest is employed to rank the importance
of the explanatory variables. The ranking is then used for feature selection.
The proposed DL approach is applied to the short-term forecasting of passenger
demand under an on-demand ride service platform in Hangzhou, China.
Experimental results, validated on real-world data provided by DiDi Chuxing,
show that the FCL-Net achieves better predictive performance than traditional
approaches including both classical time-series prediction models and neural
network based algorithms (e.g., artificial neural network and LSTM). This paper
is one of the first DL studies to forecast the short-term passenger demand of
an on-demand ride service platform by examining the spatio-temporal
correlations.Comment: 39 pages, 10 figure
Measuring Membership Privacy on Aggregate Location Time-Series
While location data is extremely valuable for various applications,
disclosing it prompts serious threats to individuals' privacy. To limit such
concerns, organizations often provide analysts with aggregate time-series that
indicate, e.g., how many people are in a location at a time interval, rather
than raw individual traces. In this paper, we perform a measurement study to
understand Membership Inference Attacks (MIAs) on aggregate location
time-series, where an adversary tries to infer whether a specific user
contributed to the aggregates.
We find that the volume of contributed data, as well as the regularity and
particularity of users' mobility patterns, play a crucial role in the attack's
success. We experiment with a wide range of defenses based on generalization,
hiding, and perturbation, and evaluate their ability to thwart the attack
vis-a-vis the utility loss they introduce for various mobility analytics tasks.
Our results show that some defenses fail across the board, while others work
for specific tasks on aggregate location time-series. For instance, suppressing
small counts can be used for ranking hotspots, data generalization for
forecasting traffic, hotspot discovery, and map inference, while sampling is
effective for location labeling and anomaly detection when the dataset is
sparse. Differentially private techniques provide reasonable accuracy only in
very specific settings, e.g., discovering hotspots and forecasting their
traffic, and more so when using weaker privacy notions like crowd-blending
privacy. Overall, our measurements show that there does not exist a unique
generic defense that can preserve the utility of the analytics for arbitrary
applications, and provide useful insights regarding the disclosure of sanitized
aggregate location time-series
An investigation into machine learning approaches for forecasting spatio-temporal demand in ride-hailing service
In this paper, we present machine learning approaches for characterizing and
forecasting the short-term demand for on-demand ride-hailing services. We
propose the spatio-temporal estimation of the demand that is a function of
variable effects related to traffic, pricing and weather conditions. With
respect to the methodology, a single decision tree, bootstrap-aggregated
(bagged) decision trees, random forest, boosted decision trees, and artificial
neural network for regression have been adapted and systematically compared
using various statistics, e.g. R-square, Root Mean Square Error (RMSE), and
slope. To better assess the quality of the models, they have been tested on a
real case study using the data of DiDi Chuxing, the main on-demand ride hailing
service provider in China. In the current study, 199,584 time-slots describing
the spatio-temporal ride-hailing demand has been extracted with an
aggregated-time interval of 10 mins. All the methods are trained and validated
on the basis of two independent samples from this dataset. The results revealed
that boosted decision trees provide the best prediction accuracy (RMSE=16.41),
while avoiding the risk of over-fitting, followed by artificial neural network
(20.09), random forest (23.50), bagged decision trees (24.29) and single
decision tree (33.55).Comment: Currently under review for journal publicatio
- …