5,033 research outputs found

    Short-Term Forecasting of Passenger Demand under On-Demand Ride Services: A Spatio-Temporal Deep Learning Approach

    Full text link
    Short-term passenger demand forecasting is of great importance to the on-demand ride service platform, which can incentivize vacant cars moving from over-supply regions to over-demand regions. The spatial dependences, temporal dependences, and exogenous dependences need to be considered simultaneously, however, which makes short-term passenger demand forecasting challenging. We propose a novel deep learning (DL) approach, named the fusion convolutional long short-term memory network (FCL-Net), to address these three dependences within one end-to-end learning architecture. The model is stacked and fused by multiple convolutional long short-term memory (LSTM) layers, standard LSTM layers, and convolutional layers. The fusion of convolutional techniques and the LSTM network enables the proposed DL approach to better capture the spatio-temporal characteristics and correlations of explanatory variables. A tailored spatially aggregated random forest is employed to rank the importance of the explanatory variables. The ranking is then used for feature selection. The proposed DL approach is applied to the short-term forecasting of passenger demand under an on-demand ride service platform in Hangzhou, China. Experimental results, validated on real-world data provided by DiDi Chuxing, show that the FCL-Net achieves better predictive performance than traditional approaches including both classical time-series prediction models and neural network based algorithms (e.g., artificial neural network and LSTM). This paper is one of the first DL studies to forecast the short-term passenger demand of an on-demand ride service platform by examining the spatio-temporal correlations.Comment: 39 pages, 10 figure

    Measuring Membership Privacy on Aggregate Location Time-Series

    Get PDF
    While location data is extremely valuable for various applications, disclosing it prompts serious threats to individuals' privacy. To limit such concerns, organizations often provide analysts with aggregate time-series that indicate, e.g., how many people are in a location at a time interval, rather than raw individual traces. In this paper, we perform a measurement study to understand Membership Inference Attacks (MIAs) on aggregate location time-series, where an adversary tries to infer whether a specific user contributed to the aggregates. We find that the volume of contributed data, as well as the regularity and particularity of users' mobility patterns, play a crucial role in the attack's success. We experiment with a wide range of defenses based on generalization, hiding, and perturbation, and evaluate their ability to thwart the attack vis-a-vis the utility loss they introduce for various mobility analytics tasks. Our results show that some defenses fail across the board, while others work for specific tasks on aggregate location time-series. For instance, suppressing small counts can be used for ranking hotspots, data generalization for forecasting traffic, hotspot discovery, and map inference, while sampling is effective for location labeling and anomaly detection when the dataset is sparse. Differentially private techniques provide reasonable accuracy only in very specific settings, e.g., discovering hotspots and forecasting their traffic, and more so when using weaker privacy notions like crowd-blending privacy. Overall, our measurements show that there does not exist a unique generic defense that can preserve the utility of the analytics for arbitrary applications, and provide useful insights regarding the disclosure of sanitized aggregate location time-series

    An investigation into machine learning approaches for forecasting spatio-temporal demand in ride-hailing service

    Full text link
    In this paper, we present machine learning approaches for characterizing and forecasting the short-term demand for on-demand ride-hailing services. We propose the spatio-temporal estimation of the demand that is a function of variable effects related to traffic, pricing and weather conditions. With respect to the methodology, a single decision tree, bootstrap-aggregated (bagged) decision trees, random forest, boosted decision trees, and artificial neural network for regression have been adapted and systematically compared using various statistics, e.g. R-square, Root Mean Square Error (RMSE), and slope. To better assess the quality of the models, they have been tested on a real case study using the data of DiDi Chuxing, the main on-demand ride hailing service provider in China. In the current study, 199,584 time-slots describing the spatio-temporal ride-hailing demand has been extracted with an aggregated-time interval of 10 mins. All the methods are trained and validated on the basis of two independent samples from this dataset. The results revealed that boosted decision trees provide the best prediction accuracy (RMSE=16.41), while avoiding the risk of over-fitting, followed by artificial neural network (20.09), random forest (23.50), bagged decision trees (24.29) and single decision tree (33.55).Comment: Currently under review for journal publicatio
    corecore