Non-Employment Activity Type Imputation from Points of Interest and Mobility Data at an Individual Level: How Accurate Can We Get?
Human activity type inference has long been the focus for applications ranging from
managing transportation demand to monitoring changes in land use patterns. Today’s ever-increasing
volume of mobility data allows researchers to explore a wide range of methodological approaches
for this task. Such data, however, lack reference observations that would allow the validation of
methodological approaches. This research proposes a methodological framework for urban activity
type inference using a Dirichlet multinomial dynamic Bayesian network with an empirical Bayes prior
that can be applied to mobility data of low spatiotemporal resolution. The method was validated
using open source Foursquare data under different isochrone configurations. The results provide
evidence of the limits of activity detection accuracy using such data, as determined by the Area
Under the Receiver Operating Characteristic curve (AUROC), log-loss, and accuracy metrics. At the same time,
results demonstrate that a hierarchical modeling framework can provide some flexibility against the
challenges related to the nature of unsupervised activity classification using trajectory variables and
POIs as input.
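The building block named in the abstract above, a Dirichlet-multinomial with a prior, can be illustrated with its conjugate update. This is only a minimal sketch and not the authors' dynamic Bayesian network: the activity labels, POI counts, and prior pseudo-counts are hypothetical, and the log-loss function shows just one of the metrics mentioned.

```python
from math import log

def posterior_mean(counts, prior):
    """Posterior mean of a Dirichlet-multinomial: (alpha_k + n_k) / sum(alpha + n)."""
    total = sum(counts) + sum(prior)
    return [(a + n) / total for a, n in zip(prior, counts)]

def log_loss(probs, true_index):
    """Negative log-probability assigned to the class that was actually observed."""
    return -log(probs[true_index])

# Hypothetical activity types and POI counts inside one isochrone.
activities = ["eating", "shopping", "leisure"]
poi_counts = [6, 3, 1]

# Hypothetical prior pseudo-counts; an empirical Bayes prior would be
# estimated from pooled data rather than fixed by hand as it is here.
prior = [2.0, 2.0, 2.0]

probs = posterior_mean(poi_counts, prior)
for name, p in zip(activities, probs):
    print(f"{name}: {p:.4f}")
print(f"log-loss if the true activity is 'eating': {log_loss(probs, 0):.4f}")
```

The prior pulls the estimate toward the pooled shares when few POIs fall inside the isochrone, which is one way such a model can remain usable with data of low spatiotemporal resolution.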
Routine pattern discovery and anomaly detection in individual travel behavior
Discovering patterns and detecting anomalies in individual travel behavior is
a crucial problem in both research and practice. In this paper, we address this
problem by building a probabilistic framework to model individual
spatiotemporal travel behavior data (e.g., trip records and trajectory data).
We develop a two-dimensional latent Dirichlet allocation (LDA) model to
characterize the generative mechanism of spatiotemporal trip records of each
traveler. This model introduces two separate factor matrices for the spatial
dimension and the temporal dimension, respectively, and uses a two-dimensional
core structure at the individual level to effectively model the joint
interactions and complex dependencies. This model can efficiently summarize
travel behavior patterns on both spatial and temporal dimensions from very
sparse trip sequences in an unsupervised way. In this way, complex travel
behavior can be modeled as a mixture of representative and interpretable
spatiotemporal patterns. By applying the trained model to future/unseen
spatiotemporal records of a traveler, we can detect her behavioral anomalies by
scoring those observations using perplexity. We demonstrate the effectiveness
of the proposed modeling framework on a real-world license plate recognition
(LPR) data set. The results confirm the advantage of statistical learning
methods in modeling sparse individual travel behavior data. Pattern discovery
and anomaly detection applications of this type can provide useful
insights for traffic monitoring, law enforcement, and individual travel
behavior profiling.
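Perplexity scoring, as described in the abstract above, can be sketched with a plain mixture over discrete (location, time) tokens. This is a simplified stand-in under stated assumptions, not the paper's two-dimensional LDA: the patterns, mixture weights, and smoothing floor are all hypothetical.

```python
from math import exp, log

FLOOR = 1e-9  # hypothetical smoothing floor for tokens a pattern never produced

def perplexity(observations, mixture_weights, pattern_probs):
    """exp of the negative mean log-likelihood of the observations under the mixture."""
    total_log_lik = 0.0
    for obs in observations:
        p = sum(w * probs.get(obs, FLOOR)
                for w, probs in zip(mixture_weights, pattern_probs))
        total_log_lik += log(p)
    return exp(-total_log_lik / len(observations))

# Hypothetical learned patterns over (location, time-of-day) tokens.
commute = {("home", "am"): 0.5, ("work", "pm"): 0.5}
weekend = {("mall", "pm"): 0.7, ("home", "am"): 0.3}
patterns = [commute, weekend]
weights = [0.8, 0.2]  # hypothetical traveler-specific mixture weights

routine = [("home", "am"), ("work", "pm")]
unusual = [("airport", "night"), ("mall", "pm")]

routine_score = perplexity(routine, weights, patterns)
unusual_score = perplexity(unusual, weights, patterns)
print(routine_score < unusual_score)
```

Records that the traveler's usual patterns explain poorly receive a much higher perplexity, so a threshold on the score can flag them for review.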
Estimating Movement from Mobile Telephony Data
Mobile enabled devices are ubiquitous in modern society. The information gathered by
their normal service operations has become one of the primary data sources used in the
understanding of human mobility, social connection and information transfer. This thesis
investigates techniques that can extract useful information from anonymised call detail records
(CDR). CDR consist of mobile subscriber data describing people's connections with the network
operator, the nature of their communication activity (voice, SMS, data, etc.), the duration and
starting time of the activity, and the servicing cell identification numbers of both the
sender and the receiver when available.
The main contributions of the research are a methodology for distance measurements
which enables the identification of mobile subscriber travel paths and a methodology for
population density estimation based on significant mobile subscriber regions of interest. In
addition, insights are given into how a mobile network operator may use geographically located
subscriber data to create new revenue streams and improved network performance. A range of
novel algorithms and techniques underpin the development of these methodologies. These
include, among others, techniques for CDR feature extraction, data visualisation and CDR data
cleansing.
The primary data source used in this body of work was the CDR of Meteor, a mobile
network operator in the Republic of Ireland. The Meteor network under investigation has just
over 1 million customers, which represents approximately a quarter of the country’s 4.6 million
inhabitants, and operates using both 2G and 3G cellular telephony technologies.
Results show that the steady state vector analysis of modified Markov chain mobility
models can return population density estimates comparable to population estimates obtained
through a census. Evaluated on a test dataset, travel path identification showed
that the developed distance measurements classified the routes taken by
CDR journey trajectories more accurately than traditional trajectory distance measurements.
Results from subscriber segmentation indicate that subscribers who have perceived similar
relationships to geographical features can be grouped based on weighted steady state mobility
vectors. Overall, this thesis proposes novel algorithms and techniques for the estimation of
movement from mobile telephony data addressing practical issues related to sampling, privacy
and spatial uncertainty.
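The steady state vector of a Markov chain mobility model, mentioned in the abstract above, can be approximated by power iteration. The sketch below assumes a small, hypothetical row-stochastic transition matrix between three cells; it does not reproduce the thesis's modified chain or its data.

```python
def steady_state(P, iterations=1000, tol=1e-12):
    """Approximate the stationary distribution pi (pi = pi P) by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi

# Hypothetical transition probabilities between three cell regions;
# each row sums to 1.
P = [
    [0.80, 0.15, 0.05],
    [0.20, 0.70, 0.10],
    [0.10, 0.30, 0.60],
]

pi = steady_state(P)
print([round(share, 4) for share in pi])
```

Each entry of `pi` is the long-run share of time spent in a cell, so scaling it by a subscriber total gives a rough density estimate that could then be compared with census figures.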
Short-term traffic predictions on large urban traffic networks: applications of network-based machine learning models and dynamic traffic assignment models
The paper discusses the issues faced when applying short-term traffic predictions to urban road networks and the opportunities provided by explicit and implicit models. Different specifications of Bayesian Networks and Artificial Neural Networks are applied to predict road link speeds and are tested on a large floating car data set. Moreover, two traffic assignment models of different complexity are applied to a sub-area of the road network of Rome and validated on the same floating car data set.
Modelling individual accessibility using Bayesian networks: A capabilities approach
The ability of an individual to reach and engage with basic services such as healthcare, education and activities such as employment is a fundamental aspect of their wellbeing. Within transport studies, accessibility is considered to be a valuable concept that can be used to generate insights on issues related to social exclusion due to limited access to transport options. Recently, researchers have attempted to link accessibility with popular theories of social justice such as Amartya Sen's Capabilities Approach (CA). Such studies have set the theoretical foundations for the way accessibility can be expressed through the CA; however, attempts to operationalise this approach remain fragmented and predominantly qualitative in nature. The data landscape, however, has changed over the last decade, providing an unprecedented quantity of transport related data at an individual level. Mobility data from different sources have the potential to contribute to the understanding of individual accessibility and its relation to phenomena such as social exclusion. At the same time, the unlabelled nature of such data presents a considerable challenge, as a non-trivial step of inference is required if one is to deduce the transportation modes used and activities reached. This thesis develops a novel framework for accessibility modelling using the CA as theoretical foundation. Within the scope of this thesis, this is used to assess the levels of equality experienced by individuals belonging to different population groups and its link to transport related social exclusion. In the proposed approach, activities reached and transportation modes used are considered manifestations of individual hidden capabilities. A modelling framework using dynamic Bayesian networks is developed to quantify and assess the relationships and dynamics of the different components influencing the capabilities sets.
The developed approach can also provide inferential capabilities for activity type and transportation mode detection, making it suitable for use with unlabelled mobility data such as data from Automatic Fare Collection (AFC) systems, mobile phones and social media. The usefulness of the proposed framework is demonstrated through three case studies. In the first case study, mobile phone data were used to explore the interaction of individuals with different public transportation modes. It was found that assumptions about individual mobility preferences derived from travel surveys may not always hold, providing evidence for the significance of personal characteristics to the choices of transportation modes. In the second case, the proposed framework is used for activity type inference, testing the limits of accuracy that can be achieved from unlabelled social media data. Combining the previous case studies, the third case further defines a generative model which is used to develop the proposed capabilities approach to accessibility model. Using data from London's Automatic Fare Collection (AFC) system, the elements of the capabilities set are explicitly defined and linked with an individual's personal characteristics, external variables and functionings. The results are used to explore the link between social exclusion and transport disadvantage, revealing distinct patterns that can be attributed to different accessibility levels.
A Markov Chain Monte Carlo Approach for Estimating Daily Activity Patterns
Determining the purpose of trips provides fundamental information for evaluating travel demand during the day and predicting longer-term impacts on the population’s travel behavior. The concept of tours is the best suited to considering the daily scheduling of individuals and travel interdependencies. However, the meticulous care required both to collect high-quality data and to interpret the results of advanced demand models is frequently considered a major drawback. The objective of this study is to incorporate into a standard trip-based model some inherent concepts of activity-based models in order to enhance the representation of travel behavior. The main focus of this work is to infer, employing utility theory, the trip purpose of a population at a zonal level. Making use of Markov Chain Monte Carlo, a set of parameters is estimated in order to retrieve tour-based primitives of the demand. The main advantages of this methodology are its low data requirements, as no individual information is used, and the good interpretability of the model. Estimated parameters of the priors set a utility-based probability function for departure time, which provides a dynamic overview of the demand. In order to account for the tour consistency of travel decisions, a duration constraint is added to the model. The proposed model is applied to the region of Luxembourg City and the results show the potential of the methodology for dividing an observed demand based on the activity at destination.
Modeling Individual Activity and Mobility Behavior and Assessing Ridesharing Impacts Using Emerging Data Sources
Predicting individual mobility behavior is one of the major steps of transportation planning models. Accurate prediction of individual mobility behavior will be beneficial for transportation planning. Although previous studies have used different data sources to model individual mobility behaviors, they have several limitations such as the lack of complete mobility sequences and travel mode information, limiting our ability to accurately predict individual movements. In recent years, the emergence of GPS-based floating car data (FCD) and on-demand ride-hailing service platforms has provided innovative data sources to understand and model individual mobility behavior. Compared to the previously used data sources such as mobile phone and social media data, mobility data extracted from the new data sources contain more specific, detailed, and longitudinal information on individual travel mode and coordinates of the visited locations. This dissertation explores the potential of using GPS-based FCD and on-demand ride-hailing service data with different modeling techniques towards understanding and predicting individual mobility and activity behaviors and assessing the ridesharing impacts through three studies.
Transport systems analysis : models and data
Funding: This research project has been funded by Spanish R+D Programs, specifically under Grant PID2020-112967GB-C31. Rapid advancements in new technologies, especially information and communication technologies (ICT), have significantly increased the number of sensors that capture data, namely those embedded in mobile devices. This wealth of data has garnered particular interest in analyzing transport systems, with some researchers arguing that the data alone are sufficient to render transport models unnecessary. However, this paper takes a contrary position and holds that models and data are not mutually exclusive but rather depend upon each other. Transport models are built upon established families of optimization and simulation approaches, and their development aligns with the scientific principles of operations research, which involves acquiring knowledge to derive modeling hypotheses. We provide an overview of these modeling principles and their application to transport systems, presenting numerous models that vary according to study objectives and corresponding modeling hypotheses. The data required for building, calibrating, and validating selected models are discussed, along with examples of using data analytics techniques to collect and handle the data supplied by ICT applications. The paper concludes with some comments on current and future trends.
Developing Travel Behaviour Models Using Mobile Phone Data
Improving the performance and efficiency of transport systems requires sound decision-making supported by data and models. However, conducting travel surveys to facilitate travel behaviour model estimation is an expensive venture. Hence, such surveys are typically infrequent in nature, and cover limited sample sizes. Furthermore, the quality of such data is often affected by reporting errors and changes in the respondents’ behaviour due to awareness of being observed. On the other hand, large and diverse quantities of time-stamped location data are nowadays passively generated as a by-product of technological growth. These passive data sources include Global Positioning System (GPS) traces, mobile phone network records, smart card data and social media data, to name but a few. Among these, mobile phone network records (i.e. call detail records (CDRs) and Global System for Mobile Communications (GSM) data) offer the biggest promise due to the increasing mobile phone penetration rates in both the developed and the developing worlds. Previous studies using mobile phone data have primarily focused on extracting travel patterns and trends rather than establishing mathematical relationships between the observed behaviour and the causal factors to predict the travel behaviour in alternative policy scenarios.
This research aims to extend the application of mobile phone data to travel behaviour modelling and policy analysis by augmenting the data with information derived from other sources. This comes along with significant challenges stemming from the anonymous and noisy nature of the data. Consequently, novel data fusion and modelling frameworks have been developed and tested for different modelling scenarios to demonstrate the potential of this emerging low-cost data source.
In the context of trip generation, a hybrid modelling framework has been developed to account for the anonymous nature of CDR data. This involves fusing the CDR and demographic data of a sub-sample of the users to estimate a demographic prediction sub-model based on phone usage variables extracted from the data. The demographic group membership probabilities from this model are then used as class weights in a latent class model for trip generation based on trip rates extracted from the GSM data of the same users. Once estimated, the hybrid model can be applied to probabilistically infer the socio-demographics, and subsequently, the trip generation of a large proportion of the population where only large-scale anonymous CDR data is available as an input. The estimation and validation results using data from Switzerland show that the hybrid model competes well against a typical trip generation model estimated using data with known socio-demographics of the users. The hybrid framework can be applied to other travel behaviour modelling contexts using CDR data (in mode or route choice for instance).
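The application step of the hybrid model described above can be sketched as a membership model feeding class-specific trip rates. This is a minimal illustration under stated assumptions, not the thesis's estimated model: the multinomial logit form of the demographic sub-model, the phone usage features, the coefficients, and the trip rates are all hypothetical, and estimation is not shown.

```python
from math import exp

def softmax(utilities):
    """Convert class utilities into probabilities that sum to 1."""
    m = max(utilities)
    exps = [exp(u - m) for u in utilities]
    s = sum(exps)
    return [e / s for e in exps]

def class_probabilities(features, coefficients):
    """Membership probabilities from a linear utility per demographic class."""
    utilities = [sum(b * x for b, x in zip(beta, features)) for beta in coefficients]
    return softmax(utilities)

def expected_trips(class_probs, class_trip_rates):
    """Probability-weighted average of the class-specific trip rates."""
    return sum(p * r for p, r in zip(class_probs, class_trip_rates))

# Hypothetical phone usage features: [calls per day, share of night-time calls].
features = [4.0, 0.25]
# Hypothetical coefficients, one row per demographic class.
coefficients = [[0.3, -1.0], [0.1, 2.0]]
# Hypothetical daily trip rates per class.
trip_rates = [2.0, 4.0]

probs = class_probabilities(features, coefficients)
print(probs, expected_trips(probs, trip_rates))
```

Because only phone usage variables are needed at this stage, the same calculation can be run for anonymous users whose socio-demographics are unknown.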
The potential of CDR data to capture rational route choice behaviour for long-distance inter-regional O-D pairs (joined by highly overlapping routes) is demonstrated through data fusion with information on the attributes of the alternatives extracted from multiple external sources. The effect of location discontinuities in CDR data (due to its event-driven nature), and how this impacts the ability to observe the users’ trajectories in a highly overlapping network, is discussed, prompting the development of a route identification algorithm that distinguishes between unique and broad sub-group route choices. The broad choice framework, which was developed in the context of vehicle type choice, is then adapted to accommodate this limitation, where unique route choices cannot be observed for some users and only the broad sub-groups of the possible overlapping routes are identifiable. The estimation and validation results using data from Senegal show that CDR data can capture rational route choice behaviour, as well as reasonable value of travel time estimates.
Still relying on data fusion, a novel method based on the mixed logit framework is developed to enable the analysis of departure time choice behaviour using passively collected data (GSM and GPS data) where the challenge is to deal with the lack of information on the desired times of travel. The proposed method relies on data fusion with travel time information extracted from Google Maps in the context of Switzerland. It is unique in the sense that it allows the modeller to understand the sensitivity attached to schedule delay, thus enabling its valuation, despite the passive nature of the data. The model results are in line with the expected travel behaviour, and the schedule delay valuation estimates are reasonable for the study area.
Finally, a joint trip generation modelling framework fusing CDR, household travel survey, and census data is developed. The framework adjusts the scaling factors of a traditional trip generation model (based on household travel survey data only) to optimise model performance at both the disaggregate and aggregate levels. The framework is calibrated using data from Bangladesh and the adjusted models are found to have better spatial and temporal transferability.
Thus, besides demonstrating the potential of mobile phone data, the thesis makes significant methodological and applied contributions. The use of different datasets provides rich insights that can inform policy measures related to the adoption of big data for transport studies. The research findings are particularly timely for transport agencies and practitioners working in contexts with severe data limitations (especially in developing countries), as well as academics generally interested in exploring the potential of emerging big data sources, both in transport and beyond.