Big data for activity based transport models

Hajduk, Petr

Authors: Petr Hajduk
Publication date: 10 December 2018

Abstract

Our civilization needs to know as much about itself as possible in order to keep running. Transportation is one of the important fields, and since we cannot measure all the movements happening on planet Earth, we need transport modelling. As of 2018, the four-step model still seems to be the state of practice for modelling transportation in metropolitan areas. It comes with several disadvantages, such as a lack of detail (aggregation to zones) and oversimplification of the travel demand phenomenon (trips are not combined into daily schedules). To remedy these disadvantages, the scientific community came up with activity-based models. The increased level of detail has, however, increased the demand for data. Nowadays the data is obtained from costly travel surveys, which makes the methodology a less viable option for practitioners. Therefore, this thesis focuses on possible new sources of data for the model and on using open datasets to build an activity-based model. First, we examine the existing big data sources and evaluate their usefulness for the model. As no big data source related to the movement of people was found useful for creating data suitable for the model, we proceed to create synthetic data describing the movements of the studied population. We used the Capital Region of Helsinki, Finland, as the case study region in order to work with a real data environment. The data is generated by disaggregating statistical data while preserving as much of the variability as possible. Where needed, assumptions are used to move the process forward. Using the synthetic big data, a transport model was created. Although the accuracy of the model, in terms of error on link volumes, does not reach the level of some previously developed models, it is still surprisingly precise considering that solely open data and statistics were used.
In the discussion, possible synergies with other big datasets are described with respect to the experience gained from the case study.