16 research outputs found

    Inferring Unusual Crowd Events From Mobile Phone Call Detail Records

    The pervasiveness and availability of mobile phone data offer the opportunity to discover usable knowledge about crowd behavior in urban environments. Cities can leverage such knowledge to provide better services (e.g., public transport planning, optimized resource allocation) and improve safety. Call Detail Record (CDR) data is a practical source for detecting and monitoring unusual events, given the high level of mobile phone penetration compared with GPS-equipped and open devices. In this paper, we provide a methodology for detecting unusual events from CDR data, which typically has low spatial and temporal resolution. Moreover, we introduce a concept of unusual event that involves a large number of people who exhibit unusual mobility behavior. Our careful consideration of the issues that arise from coarse-grained CDR data ultimately leads to a completely general framework that can detect unusual crowd events from CDR data effectively and efficiently. Through extensive experiments on real-world CDR data for a large city in Africa, we demonstrate that our method can detect unusual events with 16% higher recall and over 10 times higher precision, compared to state-of-the-art methods. We implement a visual analytics prototype system to help end users analyze detected unusual crowd events to best suit different application scenarios. To the best of our knowledge, this is the first work on the detection of unusual events from CDR data that considers its temporal and spatial sparseness and the distinction between users' unusual activities and daily routines. Comment: 18 pages, 6 figures

    Coupled IGMM-GANs with Applications to Anomaly Detection in Human Mobility Data

    Detecting anomalous activity in human mobility data has a number of applications, including road hazard sensing, telematics-based insurance, and fraud detection in taxi services and ride sharing. In this article, we address two challenges that arise in the study of anomalous human trajectories: (1) a lack of ground truth data on what defines an anomaly and (2) the dependence of existing methods on significant pre-processing and feature engineering. Although generative adversarial networks (GANs) seem like a natural fit for addressing these challenges, we find that existing GAN-based anomaly detection algorithms perform poorly due to their inability to handle multimodal patterns. For this purpose, we introduce an infinite Gaussian mixture model coupled with (bidirectional) GANs—IGMM-GAN—that is able to generate synthetic, yet realistic, human mobility data and simultaneously facilitates multimodal anomaly detection. Through the estimation of a generative probability density on the space of human trajectories, we are able to generate realistic synthetic datasets that can be used to benchmark existing anomaly detection methods. The estimated multimodal density also allows for a natural definition of outlier that we use for detecting anomalous trajectories. We illustrate our methodology and its improvement over existing GAN anomaly detection on several human mobility datasets, along with MNIST

    A trajectory outlier detection method based on variational auto-encoder

    Trajectory outlier detection can identify abnormal phenomena in large volumes of trajectory data, which helps to discover or predict potential traffic risks. In this work, we propose a trajectory outlier detection model based on a variational auto-encoder. First, the model encodes the trajectory data as parameters of distribution functions based on the statistical characteristics of urban traffic. Then, an auto-encoder network is built and trained. The training goal of the auto-encoder network is to maximize the generation probability of the original trajectories when decoding. Once training is complete, a trajectory outlier is detected from the difference between a trajectory and the trajectory generated by the model. The advantage of the proposed model is that detection only requires computing this difference, which greatly reduces the amount of calculation and makes the model well suited to real-time detection scenarios. In addition, the distance threshold between abnormal and normal trajectories can be set by referring to the proportion of abnormal trajectories in the training data set, which removes the difficulty of setting the threshold manually and makes the model easier to apply in different real-world scenarios. In terms of effectiveness, the proposed model achieves more than 95% accuracy, which is better than two typical density-based and classification-based detection methods, and also better than recent machine-learning-based methods. In terms of efficiency, the model converges well in the training phase and the training time increases slowly with the data scale, which is as good as or better than the comparison methods.
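
    The thresholding idea in the abstract above (deriving the cutoff from the proportion of abnormal trajectories in the training set) can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the use of a scalar reconstruction error per trajectory, and the rank-based cutoff are all assumptions made for illustration.

```python
def threshold_from_proportion(train_errors, outlier_fraction):
    """Pick a cutoff so that roughly `outlier_fraction` of the training
    reconstruction errors lie at or above it (hypothetical helper)."""
    if not train_errors:
        raise ValueError("train_errors must not be empty")
    if not 0.0 < outlier_fraction < 1.0:
        raise ValueError("outlier_fraction must be in (0, 1)")
    ranked = sorted(train_errors)
    # Number of training trajectories assumed to be abnormal (at least one).
    n_outliers = max(1, round(len(ranked) * outlier_fraction))
    # The smallest error among the assumed outliers becomes the threshold.
    return ranked[-n_outliers]


def is_outlier(error, threshold):
    """Flag a trajectory whose reconstruction error reaches the threshold."""
    return error >= threshold


# Toy reconstruction errors; real values would come from a trained model.
errors = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
cutoff = threshold_from_proportion(errors, 0.2)
```

    With the toy errors above and an assumed 20% abnormal share, the two largest errors define the outlier region, so no threshold has to be tuned by hand.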

    Spatio-temporal pattern mining from global positioning systems (GPS) trajectories dataset

    Dissertation submitted in partial fulfilment of the requirements for the degree of Master of Science in Geospatial Technologies. The increasing use of location-acquisition technology such as the Global Positioning System is leading to the collection of large spatio-temporal datasets. This offers the prospect of discovering usable knowledge about movement behavior, including interesting relationships and user characteristics that may exist implicitly in spatial databases. Spatial data mining is therefore emerging as a novel area of research. In this study, the experiments were conducted following the Knowledge Discovery in Databases process model, which starts with the selection of the datasets. The GPS trajectory dataset for this research was collected from the Microsoft Research Asia Geolife project. The data were then preprocessed. The major preprocessing activities were: filling in missing values and removing outliers; resolving inconsistencies; integrating data that contain both labeled and unlabeled datasets; dimensionality reduction; size reduction; and data transformation such as discretization. A total of 4,273 trajectory records are used for training the models. To validate the performance of the selected model, a separate 1,018 records are used as a testing set. To build the spatiotemporal model, the K-nearest Neighbors (KNN), decision tree, and Bayes algorithms were tested as supervised approaches. The model built using 10-fold cross-validation with a K value of 11 and otherwise default parameter values showed the best classification accuracy. The model has a prediction accuracy of 98.5% on the training dataset and 93.12% on the test dataset when classifying new instances into the bike, bus, car, subway, train, and walk classes. The findings of this study show that spatiotemporal data mining methods help to classify users' transportation modes. Future research directions are proposed toward developing an applicable system in the area of the study.
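
    As a rough illustration of the K-nearest-neighbors approach mentioned in the abstract above, the sketch below classifies a trip by majority vote among its nearest labeled neighbors. It is not the dissertation's model: the two features (mean and maximum speed), the toy data, and the function name are hypothetical, and it assumes the reported K value of 11 refers to the number of neighbors.

```python
import math
from collections import Counter


def knn_predict(train, query, k=11):
    """Return the majority label among the k training points closest to
    `query` (Euclidean distance). `train` is a list of (features, label)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical features: (mean speed, max speed) in m/s.
trips = [
    ((1.2, 2.0), "walk"), ((1.5, 2.5), "walk"), ((1.0, 1.8), "walk"),
    ((14.0, 25.0), "car"), ((16.0, 30.0), "car"), ((15.0, 27.0), "car"),
]
slow_trip = knn_predict(trips, (1.3, 2.1), k=3)
fast_trip = knn_predict(trips, (15.5, 28.0), k=3)
```

    The toy set is far too small for k = 11, so the usage lines pass k=3; on a dataset of the size reported in the abstract the default would apply.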

    Quality assessment of OpenStreetMap data using trajectory mining

    OpenStreetMap (OSM) data are widely used but their reliability is still variable. Many contributors to OSM have not been trained in geography or surveying and consequently their contributions, including geometry and attribute data inserts, deletions, and updates, can be inaccurate, incomplete, inconsistent, or vague. There are some mechanisms and applications dedicated to discovering bugs and errors in OSM data. Such systems can remove errors through user-checks and applying predefined rules but they need an extra control process to check the real-world validity of suspected errors and bugs. This paper focuses on finding bugs and errors based on patterns and rules extracted from the tracking data of users. The underlying idea is that certain characteristics of user trajectories are directly linked to the type of feature. Using such rules, some sets of potential bugs and errors can be identified and stored for further investigation

    Exploring Potentials in Mobile Phone GPS Data Collection and Analysis

    In order to support efficient transportation planning decisions, household travel survey data with high levels of accuracy are essential. Due to a number of issues associated with conventional household travel surveys, including high cost, low response rate, trip misreporting, and respondents’ self-reporting bias, government and private agencies are actively searching for alternative data collection methods. Recent advancements in smartphones and Global Positioning System (GPS) technologies present new opportunities to track travelers’ trips. Considering the high penetration rate of smartphones, it seems reasonable to use smartphone data as a reliable source of individual travel diaries. Many studies have applied GPS-based data in planning and demand analysis, but mobile phone GPS data has not received much attention. The Google Location History (GLH) data provide an opportunity to explore the potential of these data. This research presents a study using GLH data, including the data processing algorithm used to derive travel information and the potential applications in understanding travel patterns. The main goal of this study is to explore the potential of using cell phone GPS data to advance the understanding of mobility and travel behavior. The objectives of the study are to: a) assess the technical feasibility of using smartphones in transportation planning as a substitute for traditional household surveys; b) develop algorithms and procedures to derive travel information from smartphones; and c) identify applications in mobility and travel behavior studies that could take advantage of smartphone GPS data and would not have been possible with conventional data collection methods. This research aims to demonstrate how accurate travel information can be collected and analyzed at lower cost using smartphone GPS data and what analysis applications this new data source makes possible. 
Moreover, the framework developed in this study can provide valuable insights for others who are interested in using cell phone data. GLH data were obtained from 45 participants over a two-month period for the study. The results show great promise for using GLH data as a supplement or complement to conventional travel diary data. They show that GLH provides sufficiently high-resolution data to study people’s movement without respondent burden, and that it could potentially be applied to a large-scale study easily. The algorithms developed in this study work well with the data. This study supports the view that transportation data can be collected with smartphones less expensively and more accurately than by traditional household travel surveys. These data make it possible to investigate various issues, such as less frequent long-distance travel, hourly variations in travel behavior, and daily variations in travel behavior.

    Big Data for Urban Sustainability: Integrating Personal Mobility Dynamics in Environmental Assessments.

    To alleviate fossil fuel use, reduce air emissions, and mitigate climate change, “new mobility” systems start to emerge with technologies such as electric vehicles, multi-modal transportation enabled by information and communications technology, and car/ride sharing. Current literature on the environmental implications of these emerging systems is often limited by using aggregated travel pattern data to characterize personal mobility dynamics, neglecting the individual heterogeneity. Individual travel patterns affect several key factors that determine potential environmental impacts, including charging behaviors, connection needs between different transportation modes, and car/ride sharing potentials. Therefore, to better understand these systems and inform decision making, travel patterns at the individual level need to be considered. Using vehicle trajectory data of over 10,000 taxis in Beijing, this research demonstrates the benefits of integrating individual travel patterns into environmental assessments through three case studies (vehicle electrification, charging station siting, and ride sharing) focusing on two emerging systems: electric vehicles and ride sharing. Results from the vehicle electrification study indicate that individual travel patterns can impact the environmental performance of fleet electrification. When battery cost exceeds $200/kWh, vehicles with greater battery range cannot continuously improve travel electrification and can reduce electrification rate. At the current battery cost of $400/kWh, targeting subsidies to vehicles with battery range around 90 miles can achieve higher electrification rate. The public charging station siting case demonstrates that individual travel patterns can better estimate charging demand and guide charging infrastructure development. 
Charging stations sited according to individual travel patterns can increase electrification rate by 59% to 88% compared to existing sites. Lastly, the ride sharing case shows that trip details extracted from vehicle trajectory data enable dynamic ride sharing modeling. Shared taxi rides in Beijing can reduce total travel distance and air emissions by 33% with 10-minute travel time deviation tolerance. Only minimal tolerance to travel time change (4 minutes) is needed from the riders to enable significant ride sharing (sharing 60% of the trips and saving 20% of travel distance). In summary, vehicle trajectory data can be integrated into environmental assessments to capture individual travel patterns and improve our understanding of the emerging transportation systems. PhD, Natural Resources and Environment and Environmental Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/113510/1/caih_1.pd

    Processamento analítico de fluxos de dados de tráfego em tempo quase real (Analytical processing of traffic data streams in near real time)

    Master's degree in Information Systems (Mestrado em Sistemas de Informação). Nowadays, the technologies we interact with generate data about their usage and about the user at an unprecedented rate and variety. This creates the need to manage these data streams and to transform the data into information that is stored in a structured way, so that inferences can be made and conclusions drawn from it. The application areas are diverse, and one that has received particular attention is the processing of car traffic data obtained from GPS devices. If properly processed, these data can give users additional information about the traffic status, find the fastest routes, and even predict future traffic. This dissertation aims to implement a prototype able to process a data stream obtained in real time and structure it so as to answer questions about the traffic status at the moment and in the near future. To provide these answers, the prototype considers not only the data received in real time but also previously acquired information, so that comparisons can be made and conclusions drawn. The prototype is divided into three main modules: the pre-processing and analysis of historical data; the processing of traffic data in near real time; and the presentation of results. The prototype was tested and its results were evaluated to verify the validity of the answers returned to the user.