97 research outputs found

    A Simple Baseline for Travel Time Estimation using Large-Scale Trip Data

    The increased availability of large-scale trajectory data around the world provides rich information for the study of urban dynamics. For example, the New York City Taxi and Limousine Commission regularly releases source-destination information about trips in the taxis it regulates. Taxi data provide information about traffic patterns, and thus enable the study of urban flow: what will traffic between two locations look like at a certain date and time in the future? Existing big data methods try to outdo each other in terms of complexity and algorithmic sophistication. In the spirit of "big data beats algorithms", we present a very simple baseline which outperforms state-of-the-art approaches, including Bing Maps and Baidu Maps (whose APIs permit large-scale experimentation). Such a travel time estimation baseline has several important uses, such as navigation (fast travel time estimates can serve as approximate heuristics for A* search variants for path finding) and trip planning (which uses operating hours for popular destinations along with travel time estimates to create an itinerary). Comment: 12 pages
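
The navigation use case mentioned in this abstract, travel time estimates serving as heuristics for A* search, can be illustrated with a minimal sketch. It is not taken from the paper: the toy road graph, the node names, and the `heuristic` table (standing in for the output of a travel time estimator) are all assumptions made for illustration.

```python
import heapq

def a_star(graph, heuristic, start, goal):
    """Return (total_travel_time, path), or (None, []) if goal is unreachable.

    graph:     dict mapping node -> list of (neighbor, travel_time) edges
    heuristic: dict mapping node -> estimated remaining travel time to goal
               (hypothetical stand-in for a travel time estimator)
    """
    # Queue entries: (cost so far + heuristic, cost so far, node, path taken)
    frontier = [(heuristic.get(start, 0.0), 0.0, start, [start])]
    best_cost = {start: 0.0}
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if cost > best_cost.get(node, float("inf")):
            continue  # a cheaper route to this node was already expanded
        for neighbor, travel_time in graph.get(node, []):
            new_cost = cost + travel_time
            if new_cost < best_cost.get(neighbor, float("inf")):
                best_cost[neighbor] = new_cost
                priority = new_cost + heuristic.get(neighbor, 0.0)
                heapq.heappush(
                    frontier, (priority, new_cost, neighbor, path + [neighbor])
                )
    return None, []

# Toy example: edge weights and heuristic values are in minutes (made up).
graph = {
    "A": [("B", 4.0), ("C", 2.0)],
    "B": [("D", 5.0)],
    "C": [("B", 1.0), ("D", 8.0)],
    "D": [],
}
heuristic = {"A": 6.0, "B": 5.0, "C": 6.0, "D": 0.0}
print(a_star(graph, heuristic, "A", "D"))
```

The result is only guaranteed to be the fastest route when the heuristic never overestimates the remaining travel time. With approximate estimates, as the abstract describes them, the search may trade some optimality for speed.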

    Urban Anomaly Analytics: Description, Detection, and Prediction

    Urban anomalies may result in loss of life or property if not handled properly. Automatically raising alerts for anomalies at an early stage, or even predicting anomalies before they happen, is of great value to the public. Recently, data-driven urban anomaly analysis frameworks have been emerging, which utilize urban big data and machine learning algorithms to detect and predict urban anomalies automatically. In this survey, we present a comprehensive review of the state-of-the-art research on urban anomaly analytics. We first give an overview of four main types of urban anomalies: traffic anomalies, unexpected crowds, environmental anomalies, and individual anomalies. Next, we summarize various types of urban datasets obtained from diverse devices, i.e., trajectories, trip records, CDRs, urban sensors, event records, environment data, social media, and surveillance cameras. Subsequently, a comprehensive survey of issues in detection and prediction techniques for urban anomalies is presented. Finally, research challenges and open problems are discussed. Peer reviewed

    A Methodology with Distributed Algorithms for Large-Scale Human Mobility Prediction

    In today’s era of big data, huge amounts of spatial-temporal data related to human mobility, e.g., vehicle trajectories, are generated daily from all kinds of city-wide infrastructures. Understanding and accurately predicting such a large amount of spatial-temporal data could benefit many real-world applications, e.g., efficient relocation of transportation resources. However, the mix of spatial and temporal patterns among these activities and the scale of the data (at the city level) pose great challenges for accurate predictions under real-time constraints. To bridge the gap, this dissertation proposes a methodology for the prediction of large-scale human mobility, especially the city-level distribution of vehicle trajectories across the road network. The thesis has several major components: (1) a novel model for the prediction of spatial-temporal activities, such as people’s outflow/inflow movements, that combines latent and explicit features; (2) different models for the simulation of the corresponding flow trajectory distributions in the road network, from which hot road segments and their formation can be predicted and identified in advance; (3) different MapReduce-based distributed algorithms for the simulation and analysis of large-scale trajectory distributions under real-time constraints. First, our proposed methodology quantifies the latent features of spatial environments and temporal factors through tensor factorization, given existing mobility datasets. We model the relationship between spatial-temporal activities and the latent and other explicit features as a Gaussian process, which can be viewed as a distribution over the possible functions to predict human mobility. After the prediction of overall inflow/outflow, we further model the trajectory distributions of these movements in the road network, from which the corresponding hot road segments and their possible causes, among other things, can be predicted in advance.
For example, based on our prediction, in the next half hour, a high percentage of vehicles that travel from region A/B toward region C/D might pass through the same road segment, which indicates that a possible traffic jam or bottleneck could form there later. This process is computationally intensive and would require efficient algorithms for real-time response because the scale of a city’s road network and the possible number of trajectories that people might choose to take during certain time periods could be very large. Efficient distributed algorithms are proposed and validated
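
The hot-segment example in this abstract can be sketched as a map step and a reduce step in plain Python. This is only an illustration of the idea, not the dissertation's distributed algorithms: the function names, the segment identifiers, and the `threshold` parameter are assumptions, and everything runs in a single process rather than on a MapReduce cluster.

```python
from collections import Counter
from itertools import chain

def map_trajectory(trajectory):
    """Map step: emit each road segment of one trajectory exactly once."""
    return set(trajectory)

def reduce_counts(mapped):
    """Reduce step: count how many trajectories touch each segment."""
    return Counter(chain.from_iterable(mapped))

def hot_segments(trajectories, threshold=0.5):
    """Return {segment: share} for segments that at least `threshold`
    of the given trajectories pass through (threshold is an assumed knob)."""
    if not trajectories:
        return {}
    counts = reduce_counts(map_trajectory(t) for t in trajectories)
    total = len(trajectories)
    return {seg: n / total for seg, n in counts.items() if n / total >= threshold}

# Hypothetical predicted trajectories, each a list of road segment IDs.
predicted = [
    ["s1", "s2", "s5"],
    ["s3", "s2", "s6"],
    ["s1", "s4", "s5"],
    ["s3", "s2", "s5"],
]
print(hot_segments(predicted, threshold=0.75))
```

Segments whose share exceeds the threshold are candidates for the kind of bottleneck the abstract describes. Because each trajectory is mapped independently, the map step can be spread over workers in a real deployment.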

    Spatiotemporal Tensor Completion for Improved Urban Traffic Imputation

    Effective management of urban traffic is important for any smart city initiative. Therefore, the quality of the sensory traffic data is of paramount importance. However, like any sensory data, urban traffic data are prone to imperfections leading to missing measurements. In this paper, we focus on inter-region traffic data completion. We model the inter-region traffic as a spatiotemporal tensor that suffers from missing measurements. To recover the missing data, we propose an enhanced CANDECOMP/PARAFAC (CP) completion approach that considers the urban and temporal aspects of the traffic. To derive the urban characteristics, we divide the area of study into regions. Then, for each region, we compute urban feature vectors inspired by biodiversity, which are used to compute the urban similarity matrix. To mine the temporal aspect, we first conduct an entropy analysis to determine the most regular time series. Then, we conduct a joint Fourier and correlation analysis to compute its periodicity and construct the temporal matrix. Both urban and temporal matrices are fed into a modified CP-completion objective function. To solve this objective, we propose an alternating least squares approach that operates on the vectorized version of the inputs. We conduct a comprehensive comparative study with two evaluation scenarios. In the first one, we simulate random missing values. In the second scenario, we simulate missing values at a given area and time duration. Our results demonstrate that our approach provides effective recovery performance, reaching a 26% improvement compared to state-of-the-art CP approaches and 35% compared to state-of-the-art generative model-based approaches.
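
The periodicity step in this abstract can be illustrated with a small sketch covering only the correlation half of the "joint Fourier and correlation analysis". It is not the paper's procedure: picking the lag with the highest sample autocorrelation is a common textbook simplification, and `estimate_period`, `min_lag`, and `max_lag` are hypothetical names.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag (biased estimator)."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    if var == 0:
        return 0.0  # a constant series has no meaningful correlation structure
    cov = sum((series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag))
    return cov / var

def estimate_period(series, max_lag, min_lag=2):
    """Return the lag in [min_lag, max_lag] with the highest autocorrelation.

    Assumption: the dominant period shows up as the strongest autocorrelation
    peak, which holds for clean periodic signals but not for every series.
    """
    upper = min(max_lag, len(series) - 1)
    if upper < min_lag:
        raise ValueError("series is too short for the requested lag range")
    return max(range(min_lag, upper + 1), key=lambda lag: autocorrelation(series, lag))

# Synthetic signal that repeats every 4 samples.
signal = [0, 1, 0, -1] * 6
print(estimate_period(signal, max_lag=12))
```

Starting the search at `min_lag=2` avoids the trivial peak that smooth series tend to show at lag 1. A fuller treatment would cross-check the result against the dominant frequency of a Fourier transform, as the abstract indicates.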

    Predicting passenger origin-destination in online taxi-hailing systems

    Because of its importance to transportation planning, traffic management, and dispatch optimization, passenger origin-destination prediction has become one of the most important requirements for the management of intelligent transportation systems. In this paper, we propose a model to predict the origins and destinations of trips in the next specified time window. To extract meaningful travel flows, we use K-means clustering in four-dimensional space, with a limit on the maximum cluster size, for origin and destination zones. Because of the large number of clusters, we use non-negative matrix factorization to decrease the number of travel clusters. We also use a stacked recurrent neural network model to predict the travel count in each cluster. A comparison with other existing models shows that our proposed model has 5-7% lower mean absolute percentage error (MAPE) for 1-hour time windows, and 14% lower MAPE for 30-minute time windows. Comment: 25 pages, 20 figures
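
For readers unfamiliar with the metric reported in this abstract, MAPE can be computed as in the sketch below. The trip counts are made up, and skipping windows whose actual count is zero is an assumption of this sketch (the abstract does not say how the authors handle that case).

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Pairs whose actual value is zero are skipped, because the percentage
    error is undefined there (an assumption of this sketch).
    """
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when every actual value is zero")
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)

# Hypothetical trip counts per cluster for one time window.
actual_counts = [100, 200, 50]
predicted_counts = [90, 220, 50]
print(round(mape(actual_counts, predicted_counts), 2))
```

A lower value means the predicted counts are closer to the observed ones in relative terms, which is the sense in which the abstract compares models.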

    Managing and Analyzing Big Traffic Data-An Uncertain Time Series Approach
