56,537 research outputs found

    A MapReduce-based nearest neighbor approach for big-data-driven traffic flow prediction

    In big-data-driven traffic flow prediction systems, the robustness of prediction performance depends on accuracy and timeliness. This paper presents a new MapReduce-based nearest neighbor (NN) approach for traffic flow prediction using correlation analysis (TFPC) on a Hadoop platform. In particular, we develop a real-time prediction system with two key modules: offline distributed training (ODT) and online parallel prediction (OPP). Moreover, we build a parallel k-nearest neighbor optimization classifier that incorporates correlation information among traffic flows into the classification process. Finally, we propose a novel prediction calculation method that combines the current data observed in OPP with the classification results obtained from large-scale historical data in ODT to generate traffic flow predictions in real time. An empirical study on real-world traffic flow big data, using leave-one-out cross-validation, shows that TFPC significantly outperforms four state-of-the-art prediction approaches (autoregressive integrated moving average, Naïve Bayes, multilayer perceptron neural networks, and NN regression) in accuracy, with an improvement of up to 90.07% in the best case and an average mean absolute percent error of 5.53%. In addition, it displays excellent speedup, scaleup, and sizeup.
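
The abstract above names two ideas that a small example may make more concrete: nearest-neighbor prediction that takes correlation between flows into account, and the mean absolute percent error used to score it. The sketch below is a single-machine illustration only and is not the paper's TFPC method. The function names, the windowed history format, and the choice to weight neighbors by clipped Pearson correlation are all assumptions made for this example.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def knn_predict(history, current, k=3):
    """Predict the next flow value from the k historical windows closest to `current`.

    `history` is a list of (window, next_value) pairs (hypothetical format).
    Neighbors are chosen by Euclidean distance, then weighted by their
    correlation with the current window (an assumption, not the paper's rule).
    """
    ranked = sorted(history, key=lambda item: math.dist(item[0], current))
    neighbors = ranked[:k]
    # Negative correlations are clipped to zero so they do not contribute.
    weights = [max(pearson(window, current), 0.0) for window, _ in neighbors]
    total = sum(weights)
    if total == 0:
        # Fall back to a plain average when no neighbor is positively correlated.
        return sum(value for _, value in neighbors) / len(neighbors)
    return sum(w * value for w, (_, value) in zip(weights, neighbors)) / total


def mape(actual, predicted):
    """Mean absolute percent error, in percent; actual values must be nonzero."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Toy data: each window holds three consecutive flow counts.
history = [
    ([10, 20, 30], 40),
    ([30, 20, 10], 5),
    ([11, 21, 31], 42),
    ([100, 100, 100], 100),
]
prediction = knn_predict(history, [10, 20, 31], k=2)
```

In a MapReduce setting, the distance computation over historical windows would presumably be split across mappers, with a reducer merging the per-partition nearest neighbors; that partitioning is not shown here.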

    Travel time estimation in congested urban networks using point detectors data

    This thesis introduces a model for estimating travel time on short arterial links of congested urban networks using currently available technology. The objective is to estimate travel time with a level of accuracy acceptable for real-life traffic problems such as congestion management and emergency evacuation. To achieve this objective, various travel time estimation methods, including highway trajectories, multiple linear regression (MLR), artificial neural networks (ANN), and K-nearest neighbor (K-NN), were applied and tested on the same dataset. The results demonstrate that the ANN and K-NN methods outperform linear methods by a significant margin and perform particularly well in detecting congested intervals. To ensure the quality of the analysis results, a set of procedures and algorithms based on traffic flow theory and test field information was introduced to validate and clean the data used to build, train, and test the different models.