Traffic Flow Prediction Using Convolutional Neural Network accelerated by Spark Distributed Cluster

Abstract

Obtain information from historical data to forecast traffic flow in a city can be difficult because a precision forecasting demands large amount of data and accurate pattern analysis. Meanwhile, it is also meaningful because it provides a detailed and accurate point-to-point prediction for users. In this project, I use CNN (Convolutional Neural Network) to train the model based on the images captured by webcams in New York City. Then I deploy the training process on a Spark distributed Cluster so that the whole training process is accelerated. To efficiently combine CNN and Apache Spark, the prediction model is re-designed and optimized, and the distributed cluster is tuned. By using 5-fold validation, multiple test results are presented to provides a support for the analysis about the model optimization and distributed cluster tuning. The aim of this project is to find the most accurate prediction model for the traffic flow prediction with acceptable time cost

    Similar works