You are how you travel: A multi-task learning framework for Geodemographic inference using transit smart card data

Abstract

Geodemographics, which describes the characteristics of populations by geographic region, is of great importance in urban studies, public policy-making, social research and business, among other fields. Such data, however, are difficult to collect from the public; collection is usually done via census, which has a low update frequency. In urban areas, with the increasing prevalence of public transit equipped with automated fare payment systems, researchers can collect massive transit smart card (SC) data from a large population. The SC data record daily human activities at an individual level with high spatial and temporal resolution. They can reveal passengers' frequent activity areas (e.g., residential areas) and travel behaviours, which are closely intertwined with personal interests and characteristics. This provides new opportunities for geodemographic study. This paper develops a framework to infer travellers' demographics (such as age, income level and car ownership) and their residential areas for geodemographic mapping, using SC data together with a household survey. We first use a decision tree diagram to detect passengers' residential areas. We then represent each individual's spatio-temporal activity pattern, derived from multi-week SC data, as a 2D image. Leveraging this representation, a multi-task convolutional neural network (CNN) is employed to predict multiple demographics of individuals from the images. Combining the predicted demographics with the locations of passengers' residences, geodemographic information is then obtained. The methodology is applied to a large-scale SC dataset provided by Transport for London. The results provide new insights into the relationship between human activity patterns and demographics. To the best of our knowledge, this is the first attempt to infer geodemographics using SC data.
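As an illustration of the image representation mentioned above, the following minimal sketch folds multi-week tap-in timestamps into a small 2D grid of trip counts. The 7 x 24 layout (day of week by hour of day), the function name `activity_image` and the sample timestamps are assumptions made for this example only; the abstract does not specify the encoding the paper actually uses.

```python
from datetime import datetime


def activity_image(tap_times):
    """Fold tap-in timestamps into a 7 x 24 grid of trip counts.

    Rows are days of the week (Monday = 0) and columns are hours of the
    day. This layout is an assumption for illustration, not the paper's
    documented encoding.
    """
    grid = [[0] * 24 for _ in range(7)]
    for t in tap_times:
        grid[t.weekday()][t.hour] += 1
    return grid


# Hypothetical smart card taps for one passenger over two weeks.
taps = [
    datetime(2024, 1, 1, 8, 15),   # Monday morning
    datetime(2024, 1, 1, 17, 40),  # Monday evening
    datetime(2024, 1, 8, 8, 5),    # following Monday morning
    datetime(2024, 1, 6, 13, 0),   # Saturday afternoon
]
image = activity_image(taps)
```

A grid of this kind could then be passed to a CNN as a single-channel image, with trips from different weeks accumulating in the same cell so that regular habits appear as high-count cells.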
