18,236 research outputs found

    Modelling public transport accessibility with Monte Carlo stochastic simulations: A case study of Ostrava

    Activity-based micro-scale simulation models for transport modelling provide better evaluations of public transport accessibility, enabling researchers to overcome the shortage of reliable real-world data. Current simulation systems simplify personal behaviour, rely on zonal patterns, do not optimise public transport trips (they choose the fastest option only), and do not work with real targets and their characteristics. The new TRAMsim system uses a Monte Carlo approach, which evaluates all possible public transport and walking origin-destination (O-D) trips for k-nearest stops within a given time interval, and selects appropriate variants according to the expected scenarios and parameters derived from local surveys. For the city of Ostrava, Czechia, two commuting models were compared based on simulated movements to reach (a) randomly selected large employers and (b) proportionally selected employers using an appropriate distance-decay impedance function derived from various combinations of conditions. The validation of these models confirms the relevance of the proportional gravity-based model. Multidimensional evaluation of the potential accessibility of employers reveals issues in several localities, including a high number of transfers, high total commuting time, low variety of accessible employers and high pedestrian mode usage. The transport accessibility evaluation based on synthetic trips offers an improved understanding of local situations and helps to assess the impact of planned changes. Web of Science 1124 art. no. 709
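As a rough illustration of option (b) above, the sketch below draws employers with probability proportional to employer size multiplied by a distance-decay weight. The exponential form `exp(-beta * d)`, the value of `beta`, and all function and parameter names are assumptions made for this example; the abstract does not state which impedance function TRAMsim uses.

```python
import math
import random


def gravity_weights(sizes, distances_km, beta=0.2):
    """Weight each employer by size * exp(-beta * distance).

    The exponential decay and the default beta are illustrative
    assumptions, not values taken from the paper.
    """
    return [s * math.exp(-beta * d) for s, d in zip(sizes, distances_km)]


def sample_employers(sizes, distances_km, n_trips, beta=0.2, seed=0):
    """Draw n_trips employer indices with gravity-based probabilities."""
    rng = random.Random(seed)  # fixed seed keeps the simulation repeatable
    weights = gravity_weights(sizes, distances_km, beta)
    return rng.choices(range(len(sizes)), weights=weights, k=n_trips)


if __name__ == "__main__":
    # Hypothetical employers: number of jobs and distance from the origin.
    sizes = [500, 2000, 800]
    distances_km = [1.0, 12.0, 4.0]
    draws = sample_employers(sizes, distances_km, n_trips=1000)
    for idx in range(len(sizes)):
        print(idx, draws.count(idx))
```

Setting `beta=0` removes the distance penalty, so selection depends on employer size alone. That gives a simple size-only baseline to compare against the distance-weighted draw.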

    An investigation into machine learning approaches for forecasting spatio-temporal demand in ride-hailing service

    In this paper, we present machine learning approaches for characterizing and forecasting the short-term demand for on-demand ride-hailing services. We propose a spatio-temporal estimation of the demand as a function of variable effects related to traffic, pricing and weather conditions. With respect to the methodology, a single decision tree, bootstrap-aggregated (bagged) decision trees, random forest, boosted decision trees, and an artificial neural network for regression have been adapted and systematically compared using various statistics, e.g. R-square, Root Mean Square Error (RMSE), and slope. To better assess the quality of the models, they have been tested on a real case study using the data of DiDi Chuxing, the main on-demand ride-hailing service provider in China. In the current study, 199,584 time-slots describing the spatio-temporal ride-hailing demand were extracted with an aggregated time interval of 10 min. All the methods are trained and validated on the basis of two independent samples from this dataset. The results revealed that boosted decision trees provide the best prediction accuracy (RMSE=16.41), while avoiding the risk of over-fitting, followed by artificial neural network (20.09), random forest (23.50), bagged decision trees (24.29) and single decision tree (33.55). Comment: Currently under review for journal publication
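Two of the comparison statistics named above, RMSE and R-square, can be computed as in the minimal sketch below. The function names and the toy numbers are hypothetical and are not taken from the DiDi Chuxing data. R-square is computed here as one minus the ratio of residual to total sum of squares, which is one common definition; the paper may define it differently.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Square Error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Toy demand counts per time-slot (illustrative only).
    observed = [12, 30, 45, 22, 60]
    predicted = [15, 28, 40, 25, 55]
    print("RMSE:", rmse(observed, predicted))
    print("R-square:", r_squared(observed, predicted))
```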

    Mining large-scale human mobility data for long-term crime prediction

    Traditional crime prediction models based on census data are limited, as they fail to capture the complexity and dynamics of human activity. With the rise of ubiquitous computing, there is the opportunity to improve such models with data that make for better proxies of human presence in cities. In this paper, we leverage large human mobility data to craft an extensive set of features for crime prediction, as informed by theories in criminology and urban studies. We employ averaging and boosting ensemble techniques from machine learning, to investigate their power in predicting yearly counts for different types of crimes occurring in New York City at census tract level. Our study shows that spatial and spatio-temporal features derived from Foursquare venues and checkins, subway rides, and taxi rides, improve the baseline models relying on census and POI data. The proposed models achieve absolute R^2 metrics of up to 65% (on a geographical out-of-sample test set) and up to 89% (on a temporal out-of-sample test set). This proves that, next to the residential population of an area, the ambient population there is strongly predictive of the area's crime levels. We deep-dive into the main crime categories, and find that the predictive gain of the human dynamics features varies across crime types: such features bring the biggest boost in case of grand larcenies, whereas assaults are already well predicted by the census features. Furthermore, we identify and discuss top predictive features for the main crime categories. These results offer valuable insights for those responsible for urban policy or law enforcement

    Modeling, Predicting and Capturing Human Mobility

    Realistic models of human mobility are critical for modern day applications, specifically for recommendation systems, resource planning and process optimization domains. Given the rapid proliferation of mobile devices equipped with Internet connectivity and GPS functionality today, aggregating large sums of individual geolocation data is feasible. The thesis focuses on methodologies to facilitate data-driven mobility modeling by drawing parallels between the inherent nature of mobility trajectories, statistical physics and information theory. On the applied side, the thesis contributions lie in leveraging the formulated mobility models to construct prediction workflows by adopting a privacy-by-design perspective. This enables end users to derive utility from location-based services while preserving their location privacy. Finally, the thesis presents several approaches to generate large-scale synthetic mobility datasets by applying machine learning approaches to facilitate experimental reproducibility

    Scalable Population Synthesis with Deep Generative Modeling

    Population synthesis is concerned with the generation of synthetic yet realistic representations of populations. It is a fundamental problem in the modeling of transport where the synthetic populations of micro-agents represent a key input to most agent-based models. In this paper, a new methodological framework for how to 'grow' pools of micro-agents is presented. The model framework adopts a deep generative modeling approach from machine learning based on a Variational Autoencoder (VAE). Compared to the previous population synthesis approaches, including Iterative Proportional Fitting (IPF), Gibbs sampling and traditional generative models such as Bayesian Networks or Hidden Markov Models, the proposed method allows fitting the full joint distribution for high dimensions. The proposed methodology is compared with a conventional Gibbs sampler and a Bayesian Network by using a large-scale Danish trip diary. It is shown that, while these two methods outperform the VAE in the low-dimensional case, they both suffer from scalability issues when the number of modeled attributes increases. It is also shown that the Gibbs sampler essentially replicates the agents from the original sample when the required conditional distributions are estimated as frequency tables. In contrast, the VAE allows addressing the problem of sampling zeros by generating agents that are virtually different from those in the original data but have similar statistical properties. The presented approach can support agent-based modeling at all levels by enabling richer synthetic populations with smaller zones and more detailed individual characteristics. Comment: 27 pages, 15 figures, 4 tables
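For readers unfamiliar with the IPF baseline mentioned above, here is a minimal two-dimensional sketch. It does not implement the paper's VAE. The function name, the tolerance, and the toy tables are assumptions for illustration only. The example also shows the sampling-zeros issue in its simplest form: a cell that is zero in the seed sample stays zero after fitting.

```python
def ipf(seed, row_targets, col_targets, iters=100, tol=1e-9):
    """Iterative Proportional Fitting on a 2D table (list of lists).

    Alternately rescales rows and columns of the seed table until the
    row sums match row_targets (column sums match after each pass).
    """
    table = [list(row) for row in seed]  # work on a copy
    for _ in range(iters):
        # Row step: scale every row to its target total.
        for i, row in enumerate(table):
            total = sum(row)
            if total > 0:
                factor = row_targets[i] / total
                table[i] = [v * factor for v in row]
        # Column step: scale every column to its target total.
        for j in range(len(col_targets)):
            total = sum(row[j] for row in table)
            if total > 0:
                factor = col_targets[j] / total
                for row in table:
                    row[j] *= factor
        # Columns now match exactly; stop once rows match as well.
        err = max(abs(sum(row) - t) for row, t in zip(table, row_targets))
        if err < tol:
            break
    return table


if __name__ == "__main__":
    # Hypothetical seed counts (e.g. age group x car ownership).
    seed = [[1.0, 1.0], [1.0, 1.0]]
    fitted = ipf(seed, row_targets=[30, 70], col_targets=[40, 60])
    for row in fitted:
        print(row)
```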

    On the Need for a Language Describing Distribution Shifts: Illustrations on Tabular Datasets

    Different distribution shifts require different algorithmic and operational interventions. Methodological research must be grounded by the specific shifts they address. Although nascent benchmarks provide a promising empirical foundation, they implicitly focus on covariate shifts, and the validity of empirical findings depends on the type of shift, e.g., previous observations on algorithmic performance can fail to be valid when the $Y|X$ distribution changes. We conduct a thorough investigation of natural shifts in 5 tabular datasets over 86,000 model configurations, and find that $Y|X$-shifts are most prevalent. To encourage researchers to develop a refined language for distribution shifts, we build WhyShift, an empirical testbed of curated real-world shifts where we characterize the type of shift we benchmark performance over. Since $Y|X$-shifts are prevalent in tabular settings, we identify covariate regions that suffer the biggest $Y|X$-shifts and discuss implications for algorithmic and data-based interventions. Our testbed highlights the importance of future research that builds an understanding of how distributions differ. Comment: 41 pages

    Inferring Socioeconomic Characteristics from Travel Patterns

    Nowadays, crowd-based big data is widely used in transportation planning. These data sources provide valuable information for model validation; however, they cannot be used to estimate travel demand forecasting models, because these models need a linkage between travel patterns and the socioeconomic characteristics of the people making trips and such a connection is not available due to privacy issues. As such, uncovering the correlation between travel patterns and socioeconomic characteristics is crucial for travel demand modelers to be able to leverage such data in model estimation. Different age, gender, and income groups may have specific travel behavior preferences. To extract and investigate these patterns, we used two data sets: one from the National Household Travel Survey 2009 and the other from the Metropolitan Washington Council of Government Transportation Planning Board 2007-2008 household survey. After preprocessing the data, a range of machine learning algorithms were used to synthesize the socioeconomic characteristics of travelers. After comparison, we found that the CatBoost model outperformed the other models. To further improve the results, a synthetic population and Bayesian updating were used, which considerably improved the estimation of income. This study showed that the conventional inference of travel demand from socioeconomic patterns can be reversed, creating an opportunity to utilize the plethora of crowd-based mobility data

    An original framework for understanding human actions and body language by using deep neural networks

    The evolution of both Computer Vision (CV) and Artificial Neural Networks (ANNs) has allowed the development of efficient automatic systems for the analysis of people's behaviour. By studying hand movements it is possible to recognize gestures, often used by people to communicate information in a non-verbal way. These gestures can also be used to control or interact with devices without physically touching them. In particular, sign language and semaphoric hand gestures are the two foremost areas of interest due to their importance in Human-Human Communication (HHC) and Human-Computer Interaction (HCI), respectively. The processing of body movements, meanwhile, plays a key role in the action recognition and affective computing fields. The former is essential to understand how people act in an environment, while the latter tries to interpret people's emotions based on their poses and movements; both are essential tasks in many computer vision applications, including event recognition and video surveillance. In this Ph.D. thesis, an original framework for understanding actions and body language is presented. The framework is composed of three main modules: in the first one, a method based on Long Short-Term Memory Recurrent Neural Networks (LSTM-RNNs) for the recognition of sign language and semaphoric hand gestures is proposed; the second module presents a solution based on 2D skeletons and two-branch stacked LSTM-RNNs for action recognition in video sequences; finally, in the last module, a solution for basic non-acted emotion recognition using 3D skeletons and Deep Neural Networks (DNNs) is provided. The performance of LSTM-RNNs is explored in depth, due to their ability to model the long-term contextual information of temporal sequences, which makes them suitable for analysing body movements. All the modules were tested on challenging, well-known datasets from the state of the art, showing remarkable results compared to current methods in the literature