103 research outputs found

    A stigmergy-based analysis of city hotspots to discover trends and anomalies in urban transportation usage

    A key aspect of a sustainable urban transportation system is the effectiveness of transportation policies. To be effective, a policy has to consider a broad range of elements, such as pollution emission, traffic flow, and human mobility. Due to the complexity and variability of these elements in the urban area, producing effective policies remains a very challenging task. With the introduction of the smart city paradigm, a large amount of data can be generated in urban spaces. Such data can be a fundamental source of knowledge to improve policies because they can reflect the sustainability issues underlying the city. In this context, we propose an approach to exploit urban positioning data based on stigmergy, a bio-inspired mechanism providing scalar and temporal aggregation of samples. By employing stigmergy, samples in proximity with each other are aggregated into a functional structure called a trail. The trail summarizes relevant dynamics in data and allows matching them, providing a measure of their similarity. Moreover, this mechanism can be specialized to unfold specific dynamics. Specifically, we identify high-density urban areas (i.e., hotspots), analyze their activity over time, and unfold anomalies. By matching activity patterns, we also provide a continuous measure of the dissimilarity with respect to the typical activity pattern. This measure can be used by policy makers to evaluate the effect of policies and change them dynamically. As a case study, we analyze taxi trip data gathered in Manhattan from 2013 to 2015. Comment: Preprint

    Spatiotemporal analysis of taxi availability and pick-ups: A case study of Suzhou, China

    This study utilized a seven-day taxi trajectory dataset to investigate the difficulty of finding vacant taxis in Suzhou, China, by analyzing the imbalance (IMB) between rider pick-ups and the number of vacant taxis on each road segment in Suzhou. To recognize significant local high vs. low frequency patterns of events, and to make the values of imbalance as representative as possible, a hierarchical structure of multi-resolution time windows that split each hour into as many as four parts was developed based on the minimum variance method of hierarchical clustering (Ward, 1963). In addition to imbalance, the second variable to be analyzed was the number of time windows (NTW) for each one-hour period. Two tools from ArcGIS, “Global Spatial Autocorrelation (Moran’s I)” and “Hot Spot Analysis (Getis-Ord Gi*)”, were the main ones used in the analyses of IMB and NTW. During the analyses, the global spatial autocorrelation, the number of hot spots, and the spatial distribution pattern of both variables were inspected. An evenly distributed spatiotemporal pattern was observed for the NTW hot spots, and an “early morning–daytime–transition–early morning” spatiotemporal pattern was observed for the IMB hot spots. These patterns helped clarify the two types of difficulties of finding vacant taxis; i.e., the first type was caused by the low frequency of the event, and the second was caused by the competition among riders. Finally, the results of Pearson correlation analyses indicated that the two types of difficulties existed independently of each other.

    Data from mobile phone operators: A tool for smarter cities?

    The use of mobile phone data provides new spatio-temporal tools for improving urban planning, and for reducing inefficiencies in present-day urban systems. Data from mobile phones, originally intended as a communication tool, are increasingly used as innovative tools in geography and social sciences research. Empirical studies on complex city systems from human-centred and urban dynamics perspectives provide new insights to develop promising applications for supporting smart city initiatives. This paper provides a comprehensive review and a typology of spatial studies on mobile phone data, and highlights the applicability of such digital data to develop innovative applications for enhanced urban management.

    Community Detection in Multimodal Networks

    Community detection on networks is a basic, yet powerful and ever-expanding set of methodologies that is useful in a variety of settings. This dissertation discusses a range of community detection methods for networks with multiple and non-standard modalities. A major focus of analysis is on the study of networks spanning several layers, which represent relationships such as interactions over time or different facets of high-dimensional data. These networks may be represented in several different ways, namely the few-layer (i.e., longitudinal) case and the many-layer (time-series) case. In the first case, we develop a novel application of variational expectation maximization as an example of the top-down mode of simultaneous community detection and parameter estimation. In the second case, we use a bottom-up strategy of iterative nodal discovery for these longer time-series, abetted with the assumption of their structural properties. In addition, we explore significantly self-looping networks, whose features are inseparable from the inherent construction of spatial networks whose weights are reflective of distance information. These types of networks are used to model and demarcate geographical regions. We also describe some theoretical properties and applications of a method for finding communities in bipartite networks that are weighted by correlations between samples. We discuss different strategies for community detection in each of these different types of networks, as well as their implications for the broader contributions to the literature. In addition to the methodologies, we also highlight the types of data wherein these “non-standard” network structures arise and how they suit the applications of the proposed methodologies, particularly spatial networks and multilayer networks.
We apply the top-down and bottom-up community detection algorithms to data in the domains of demography, human mobility, genomics, climate science, psychiatry, politics, and neuroimaging. The expansiveness and diversity of these data speak to the flexibility and ubiquity of our proposed methods to all forms of relational data. Doctor of Philosophy

    Developing Travel Behaviour Models Using Mobile Phone Data

    Improving the performance and efficiency of transport systems requires sound decision-making supported by data and models. However, conducting travel surveys to facilitate travel behaviour model estimation is an expensive venture. Hence, such surveys are typically infrequent in nature, and cover limited sample sizes. Furthermore, the quality of such data is often affected by reporting errors and changes in the respondents’ behaviour due to awareness of being observed. On the other hand, large and diverse quantities of time-stamped location data are nowadays passively generated as a by-product of technological growth. These passive data sources include Global Positioning System (GPS) traces, mobile phone network records, smart card data and social media data, to name but a few. Among these, mobile phone network records (i.e. call detail records (CDRs) and Global System for Mobile Communications (GSM) data) offer the biggest promise due to the increasing mobile phone penetration rates in both the developed and the developing worlds. Previous studies using mobile phone data have primarily focused on extracting travel patterns and trends rather than establishing mathematical relationships between the observed behaviour and the causal factors to predict the travel behaviour in alternative policy scenarios. This research aims to extend the application of mobile phone data to travel behaviour modelling and policy analysis by augmenting the data with information derived from other sources. This comes along with significant challenges stemming from the anonymous and noisy nature of the data. Consequently, novel data fusion and modelling frameworks have been developed and tested for different modelling scenarios to demonstrate the potential of this emerging low-cost data source. In the context of trip generation, a hybrid modelling framework has been developed to account for the anonymous nature of CDR data. 
This involves fusing the CDR and demographic data of a sub-sample of the users to estimate a demographic prediction sub-model based on phone usage variables extracted from the data. The demographic group membership probabilities from this model are then used as class weights in a latent class model for trip generation based on trip rates extracted from the GSM data of the same users. Once estimated, the hybrid model can be applied to probabilistically infer the socio-demographics, and subsequently, the trip generation of a large proportion of the population where only large-scale anonymous CDR data is available as an input. The estimation and validation results using data from Switzerland show that the hybrid model competes well against a typical trip generation model estimated using data with known socio-demographics of the users. The hybrid framework can be applied to other travel behaviour modelling contexts using CDR data (in mode or route choice for instance). The potential of CDR data to capture rational route choice behaviour for long-distance inter-regional O-D pairs (joined by highly overlapping routes) is demonstrated through data fusion with information on the attributes of the alternatives extracted from multiple external sources. The effect of location discontinuities in CDR data (due to its event-driven nature), and how this impacts the ability to observe the users’ trajectories in a highly overlapping network, is discussed, prompting the development of a route identification algorithm that distinguishes between unique and broad sub-group route choices. The broad choice framework, which was developed in the context of vehicle type choice, is then adapted to address this limitation where unique route choices cannot be observed for some users, and only the broad sub-groups of the possible overlapping routes are identifiable. 
The estimation and validation results using data from Senegal show that CDR data can capture rational route choice behaviour, as well as reasonable value of travel time estimates. Still relying on data fusion, a novel method based on the mixed logit framework is developed to enable the analysis of departure time choice behaviour using passively collected data (GSM and GPS data) where the challenge is to deal with the lack of information on the desired times of travel. The proposed method relies on data fusion with travel time information extracted from Google Maps in the context of Switzerland. It is unique in the sense that it allows the modeller to understand the sensitivity attached to schedule delay, thus enabling its valuation, despite the passive nature of the data. The model results are in line with the expected travel behaviour, and the schedule delay valuation estimates are reasonable for the study area. Finally, a joint trip generation modelling framework fusing CDR, household travel survey, and census data is developed. The framework adjusts the scaling factors of a traditional trip generation model (based on household travel survey data only) to optimise model performance at both the disaggregate and aggregate levels. The framework is calibrated using data from Bangladesh and the adjusted models are found to have better spatial and temporal transferability. Thus, besides demonstrating the potential of mobile phone data, the thesis makes significant methodological and applied contributions. The use of different datasets provides rich insights that can inform policy measures related to the adoption of big data for transport studies. The research findings are particularly timely for transport agencies and practitioners working in contexts with severe data limitations (especially in developing countries), as well as academics generally interested in exploring the potential of emerging big data sources, both in transport and beyond

    The Possibility of Big Data Spatio-Temporal Analytics for Understanding Human Behavior and Their Spatial Patterns in Urban Area

    Kanazawa University doctoral dissertation, full text; degree: Doctor of Philosophy (博士(学術)); degree report number 13301甲第4630号

    Designing an On-Demand Dynamic Crowdshipping Model and Evaluating its Ability to Serve Local Retail Delivery in New York City

    City mobility is challenging today, particularly in populated metropolitan areas. Growing commute demands, an increase in the number of for-hire vehicles, an enormous escalation in intra-city deliveries, and limited infrastructure (road capacities) all contribute to mobility challenges. These challenges typically have significant impacts on residents’ quality of life, particularly from an economic and environmental perspective. Decision-makers have to optimize transportation resources to minimize the system externalities (especially in large-scale metropolitan areas). This thesis focuses on the intra-city mobility problems experienced by travelers (in the form of congestion and imbalanced taxi resources) and businesses (in the form of last-mile delivery), while taking into consideration a measurement of potential adoption by citizens (in the form of a survey). To find solutions for this mobility problem, this dissertation proposes three distinct and complementary methodological studies. First, taxi demand is predicted by employing a deep learning approach that leverages Long Short-Term Memory (LSTM) neural networks, trained over publicly available New York City taxi trip data. Taxi pickup data are binned based on geospatial and temporal informational tags, which are then clustered using a technique inspired by Principal Component Analysis. The spatiotemporal distribution of the taxi pickup demand is studied within short-term periods (for the next hour) as well as long-term periods (for the next 48 hours) within each data cluster. The performance and robustness of the LSTM model are evaluated through a comparison with Adaptive Boosting Regression and Decision Tree Regression models fitted to the same datasets. In the next study, an On-Demand Dynamic Crowdshipping system is designed to utilize excess transport capacity to serve parcel delivery tasks and passengers collectively. 
This method is general and could be expanded and used for all types of public transportation modes depending upon the availability of data. This system is evaluated for the case study of New York City to assess the impacts of the crowdshipping system (using taxis as carriers) on trip cost, vehicle miles traveled, and people’s travel behavior. Finally, a Stated Preference (SP) survey is presented, designed to collect information about people’s willingness to participate in a crowdshipping system. The survey is analyzed to determine the essential attributes and evaluate the likelihood of individuals participating in the service either as requesters or as carriers. The survey collects information on the preferences and important attributes of New York citizens, describing what segments of the population are willing to participate in a crowdshipping system. While the transportation problems are complex and approximations had to be made within the studies to achieve progress, this dissertation provides a comprehensive way to model and understand the potential impact of efficient utilization of existing resources on transportation systems. Generally, this study offers insights to decision makers and academics about potential areas of opportunity and methodologies to optimize the transportation system of densely populated areas. This dissertation offers methods that can optimize taxi distribution based on the demand and optimize costs for retail delivery while providing additional income for individuals. It also provides valuable insights for decision makers in terms of collecting population opinion about the service and analyzing the likelihood of participating in the service. The analysis provides an initial foundation for future modeling and assessment of crowdshipping.