
    A Data-driven Methodology Towards Mobility- and Traffic-related Big Spatiotemporal Data Frameworks

    Human population is increasing at unprecedented rates, particularly in urban areas. This increase, along with the rise of a more economically empowered middle class, brings new and complex challenges to the mobility of people within urban areas. To tackle such challenges, transportation and mobility authorities and operators are trying to adopt innovative Big Data-driven Mobility- and Traffic-related solutions. Such solutions will help decision-making processes that aim to ease the load on an already overloaded transport infrastructure. The information collected from day-to-day mobility and traffic can help to mitigate some of these mobility challenges in urban areas. Road infrastructure and traffic management operators (RITMOs) face several limitations to effectively extract value from the exponentially growing volumes of mobility- and traffic-related Big Spatiotemporal Data (MobiTrafficBD) that are being acquired and gathered. Research about the topics of Big Data, Spatiotemporal Data and especially MobiTrafficBD is scattered, and existing literature does not offer a concrete, common methodological approach to set up, configure, deploy and use a complete Big Data-based framework to manage the lifecycle of mobility-related spatiotemporal data, mainly focused on geo-referenced time series (GRTS) and spatiotemporal events (ST Events), extract value from it and support decision-making processes of RITMOs. This doctoral thesis proposes a data-driven, prescriptive methodological approach towards the design, development and deployment of MobiTrafficBD Frameworks focused on GRTS and ST Events. 
Besides a thorough literature review on Spatiotemporal Data, Big Data and the merging of these two fields through MobiTrafficBD, the methodological approach comprises a set of general characteristics, technical requirements, logical components, data flows and technological infrastructure models, as well as guidelines and best practices that aim to guide researchers, practitioners and stakeholders, such as RITMOs, throughout the design, development and deployment phases of any MobiTrafficBD Framework. This work is intended to be a supporting methodological guide, based on widely used Reference Architectures and guidelines for Big Data, but enriched with inherent characteristics and concerns brought about by Big Spatiotemporal Data, such as in the case of GRTS and ST Events. The proposed methodology was evaluated and demonstrated in various real-world use cases that deployed MobiTrafficBD-based Data Management, Processing, Analytics and Visualisation methods, tools and technologies, under the umbrella of several research projects funded by the European Commission and the Portuguese Government.

    Spatial big data and moving objects: a comprehensive survey


    An Adaptive Framework for Real-Time Spatiotemporal Big Data Analytics

    Due to advancements in and widespread usage of technologies such as smartphones, satellites, smart sensors, and social networks, collection of spatiotemporal data is growing rapidly. Such massive spatiotemporal data require appropriate techniques and technologies for their efficient analysis and processing. Analyzing massive spatiotemporal data efficiently and effectively is challenging since the data change dynamically over space and time, whereas the decisions that follow the analysis often need to be made under real-time constraints. Compared to non-spatial data, spatiotemporal data, among other unique characteristics, are multidimensional (x, y, attributes, time) in nature, complex in structures and behaviors, and provide details at different resolutions and scales. These characteristics together make analyzing and processing massive spatiotemporal data in real time a challenging task. Resorting to high-performance computing (HPC) is a common approach for handling this computing challenge, but to determine optimal solutions through data and computation analysis, appropriate analytics and computing solutions are needed. In this dissertation, we propose a framework, essentially a platform, that provides spatiotemporal data-intensive analytics for data- and compute-intensive applications that require computation under real-time constraints on given computing resources. The framework is a layered structure consisting of four interrelated components (layers): three on analytics and one on adaptive computing. 
A graph-based approach is developed as the foundation of the analytics components, which are: efficient analytics – providing acceptable solutions based on current data in the absence of historical data; predictive analytics – providing near-optimal solutions by learning from the patterns of historical data and predicting based on the learning; and meta-analytics – providing optimal solutions by analyzing patterns of past data patterns. The fourth component, adaptive computing, ensures that appropriate analytics are applied and computation is completed in real time on available computing resources.

    Colossal Trajectory Mining: A unifying approach to mine behavioral mobility patterns

    Spatio-temporal mobility patterns are at the core of strategic applications such as urban planning and monitoring. Depending on the strength of spatio-temporal constraints, different mobility patterns can be defined. While existing approaches work well in the extraction of groups of objects sharing fine-grained paths, the huge volume of large-scale data calls for coarse-grained solutions. In this paper, we introduce Colossal Trajectory Mining (CTM) to efficiently extract heterogeneous mobility patterns out of a multidimensional space that, along with space and time dimensions, can consider additional trajectory features (e.g., means of transport or activity) to characterize behavioral mobility patterns. The algorithm is natively designed in a distributed fashion, and the experimental evaluation shows its scalability with respect to the involved features and the cardinality of the trajectory dataset.

    A distributed workload-aware approach to partitioning geospatial big data for cybergis analytics

    Numerous applications and scientific domains have contributed to tremendous growth of geospatial data during the past several decades. To cope with the volume and velocity of such big data, distributed system approaches have been extensively studied to partition data for scalable analytics and associated applications. However, previous work on partitioning large geospatial data focuses on bulk ingestion and static partitioning, and hence is unable to handle the dynamic variability in both data and computation that is particularly common for streaming data. To eliminate this limitation, this thesis holistically addresses computational intensity and dynamic data workload to achieve optimal data partitioning for scalable geospatial applications. Specifically, novel data partitioning algorithms have been developed to support scalable geospatial and temporal data management with new data models designed to represent dynamic data workload. Optimal partitions are realized by formulating a fine-grain spatial optimization problem that is solved using an evolutionary algorithm with spatially explicit operations. As an overarching approach to integrating the algorithms, data models and spatial optimization problem solving, GeoBalance is established as a workload-aware framework for supporting scalable cyberGIS (i.e. geographic information science and systems based on advanced cyberinfrastructure) analytics.

    Capturing time in space : Dynamic analysis of accessibility and mobility to support spatial planning with open data and tools

    Understanding the spatial patterns of accessibility and mobility is key to comprehending the functioning of our societies. Hence, their analysis has become increasingly important for both scientific research and spatial planning. Spatial accessibility and mobility are closely related concepts, as accessibility describes the potential to move by modeling, whereas spatial mobility describes the realized movements of individuals. While both spatial accessibility and mobility have been widely studied, the understanding of how time and temporal change affect accessibility and mobility has been rather limited thus far. In the era of ‘big data’, the wealth of temporally sensitive spatial data has made it possible, better than ever, to capture and understand the temporal realities of spatial accessibility and mobility, and hence start to understand better the dynamics of our societies and complex living environment. In this thesis, I aim to develop novel approaches and methods to study the spatio-temporal realities of our living environments via concepts of accessibility and mobility: how people can access places, how they actually move, and how they use space. I inspect these dynamics on several temporal granularities, covering hourly, daily, monthly, and yearly observations and analyses. With novel big data sources, the methodological development and careful assessment of the information extracted from them are extremely important, as they are increasingly used to guide decision-making. Hence, I investigate the opportunities and pitfalls of different data sources and methodological approaches in this work. Contextually, I aim to reveal the role of time and the mode of transportation in relation to spatial accessibility and mobility, in both urban and rural environments, and discuss their role in spatial planning. 
I base my findings on five scientific articles on studies carried out in Peruvian Amazonia; national parks of South Africa and Finland; Tallinn, Estonia; and the Helsinki metropolitan area, Finland. I use and combine data from various sources to extract knowledge from them, including GPS devices, transportation schedules, mobile phones, social media, statistics, land-use data, and surveys. My results demonstrate that spatial accessibility and mobility are highly dependent on time, having clear diurnal and seasonal changes. Hence, it is important to consider temporality when analyzing accessibility, as people, transport and activities all fluctuate as a function of time, which affects, for example, the spatial equality of reaching services. In addition, different transport modes should be considered as there are clear differences between them. Furthermore, I show that, in addition to the observed spatial population dynamics, nature’s own dynamism also affects accessibility and mobility on a regional level due to the seasonal variation in river levels. Also, the visitation patterns in national parks vary significantly over time, as can be observed from social media. Methodologically, this work demonstrates that with a sophisticated fusion of methods and data, it is possible to assess, enrich, harmonize, and increase the spatial and temporal accuracy of data that can be used to better inform spatial planning and decision-making. Finally, I wish to emphasize the importance of bringing scientific knowledge and tools into practice. Hence, all the tools, analytical workflows, and data are openly available for everyone whenever possible. This approach has helped to bring the knowledge and tools into practice with relevant stakeholders in relation to spatial planning.

    Mixed Spatial and Nonspatial Problems in Location Based Services

    With hundreds of millions of users reporting locations and embracing mobile technologies, Location Based Services (LBSs) are raising new challenges. In this dissertation, we address three emerging problems in location services, where geolocation data plays a central role. First, to handle the unprecedented growth of generated geolocation data, existing location services rely on geospatial database systems. However, their inability to leverage combined geographical and textual information in analytical queries (e.g. spatial similarity joins) remains an open problem. To address this, we introduce SpsJoin, a framework for computing spatial set-similarity joins. SpsJoin handles combined similarity queries that involve textual and spatial constraints simultaneously. LBSs use this system to tackle different types of problems, such as deduplication, geolocation enhancement and record linkage. We define the spatial set-similarity join problem in a general case and propose an algorithm for its efficient computation. Our solution utilizes parallel computing with MapReduce to handle scalability issues in large geospatial databases. Second, applications that use geolocation data are seldom concerned with ensuring the privacy of participating users. To motivate participation and address privacy concerns, we propose iSafe, a privacy preserving algorithm for computing safety snapshots of co-located mobile devices as well as geosocial network users. iSafe combines geolocation data extracted from crime datasets and geosocial networks such as Yelp. In order to enhance iSafe's ability to compute safety recommendations, even when crime information is incomplete or sparse, we need to identify relationships between Yelp venues and crime indices at their locations. To achieve this, we use SpsJoin on two datasets (Yelp venues and geolocated businesses) to find venues that have not been reviewed and to further compute the crime indices of their locations. 
Our results show a statistically significant dependence between location crime indices and Yelp features. Third, review-centered LBSs (e.g., Yelp) are increasingly becoming targets of malicious campaigns that aim to bias the public image of represented businesses. Although Yelp actively attempts to detect and filter fraudulent reviews, our experiments showed that Yelp is still vulnerable. Fraudulent LBS information also impacts the ability of iSafe to provide correct safety values. We take steps toward addressing this problem by proposing SpiDeR, an algorithm that takes advantage of the richness of information available in Yelp to detect abnormal review patterns. We propose a fake venue detection solution that applies SpsJoin on Yelp and U.S. housing datasets. We validate the proposed solutions using ground truth data extracted by our experiments and reviews filtered by Yelp.