870 research outputs found

    Knowledge discovery from trajectories

    Dissertation submitted in partial fulfilment of the requirements for the Degree of Master of Science in Geospatial Technologies. As a rapidly growing study area, knowledge discovery from trajectories has attracted more and more researchers from different backgrounds. However, there is, until now, no theoretical framework that gives researchers a systematic view of the ongoing research. The complexity of spatial and temporal information, along with their combination, produces numerous spatio-temporal patterns. In addition, a pattern may well have different definitions and mining methodologies for researchers from different backgrounds, such as Geographic Information Science, Data Mining, Databases, and Computational Geometry. How can these patterns be defined systematically, so that the whole community can make better use of previous research? This paper tackles this challenge in three steps. First, the input trajectory data are classified; second, a taxonomy of spatio-temporal patterns is developed from a data mining point of view; lastly, the spatio-temporal patterns that have appeared in previous publications are discussed and placed into the theoretical framework. In this way, researchers can easily find the methodology needed to mine a specific pattern in this framework, and algorithms that still need to be developed can be identified for further research. Under the guidance of this framework, an application to a real data set from the Starkey Project is performed. Two questions are answered by applying data mining algorithms: first, where the elk prefer to stay within the whole range, and second, whether there are corridors among these regions of interest.

    An Investigation in Efficient Spatial Patterns Mining

    Technical progress in computerized spatial data acquisition and storage has led to the growth of vast spatial databases. Faced with large and increasing amounts of spatial data, an end user has more difficulty understanding them without helpful knowledge extracted from the spatial databases. Thus, spatial data mining has been brought under the umbrella of data mining and is attracting more attention. Spatial data mining presents challenges. Unlike ordinary data, spatial data include not only positional data and attribute data, but also spatial relationships among spatial events. Further, the instances of spatial events are embedded in a continuous space and share a variety of spatial relationships, so the mining of spatial patterns demands new techniques. In this thesis, several contributions were made. Some new techniques were proposed, namely fuzzy co-location mining, the CPI-tree (Co-location Pattern Instance Tree), maximal co-location pattern mining, AOI-ags (Attribute-Oriented Induction based on Attributes’ Generalization Sequences), and fuzzy association prediction. Three algorithms were put forward for co-location pattern mining: the fuzzy co-location mining algorithm, the CPI-tree-based co-location mining algorithm (CPI-tree algorithm) and the order-clique-based maximal prevalence co-location mining algorithm (order-clique-based algorithm). An attribute-oriented induction algorithm based on attributes’ generalization sequences (AOI-ags algorithm) is further given, which unifies the attribute thresholds and the tuple thresholds. For two real-world databases with time-series data, a fuzzy association prediction algorithm is designed. A cell-based spatial object fusion algorithm is also proposed. Two fuzzy clustering methods using domain knowledge were proposed, the Natural Method and the Graph-Based Method, both of which are controlled by a threshold. The threshold was confirmed by polynomial regression.
Finally, a prototype system for spatial co-location pattern mining was developed, which shows the relative efficiencies of the proposed co-location techniques. The techniques presented in the thesis focus on improving the feasibility, usefulness, effectiveness, and scalability of the related algorithms. In the design of the fuzzy co-location mining algorithm, a new data structure, the binary partition tree, was proposed to improve the process of fuzzy equivalence partitioning. A prefix-based approach that partitions the prevalent event set search space into subsets, where each sub-problem can be solved in main memory, was also presented. The scalability of the CPI-tree algorithm is guaranteed since it does not require expensive spatial joins or instance joins for identifying co-location table instances. In the order-clique-based algorithm, the co-location table instances need not be stored after computing the Pi value of the corresponding co-location, which dramatically reduces the execution time and space of mining maximal co-locations. Several techniques, for example partitions, equivalence partition trees, pruning optimization strategies and interestingness, were used to improve the efficiency of the AOI-ags algorithm. To implement the fuzzy association prediction algorithm, the “growing window” and proximity computation pruning were introduced to reduce both I/O and CPU costs in computing the fuzzy semantic proximity between time series. For the new techniques and algorithms, theoretical analysis and experimental results on synthetic and real-world data sets are presented and discussed in the thesis.
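
The abstract above refers to a "Pi value" of a co-location without defining it. In the co-location mining literature this usually denotes the participation index: the minimum, over the features in a pattern, of the fraction of that feature's instances that take part in at least one row instance of the pattern. The following is a minimal brute-force sketch under that assumption; it is not the thesis's CPI-tree or order-clique-based algorithm, and the function name `participation_index`, the `(feature, x, y)` input layout and the Euclidean `radius` neighbourhood are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from itertools import product
from math import hypot


def participation_index(points, pattern, radius):
    """Participation index of `pattern` (a tuple of feature names).

    points : list of (feature, x, y) tuples (assumed input layout)
    radius : two instances are neighbours if their distance <= radius
    """
    # Group instances by feature, remembering each instance's index.
    by_feature = defaultdict(list)
    for idx, (feature, x, y) in enumerate(points):
        by_feature[feature].append((idx, x, y))

    def near(a, b):
        return hypot(a[1] - b[1], a[2] - b[2]) <= radius

    # For each feature, the set of instances seen in some row instance.
    participating = {f: set() for f in pattern}

    # Brute force: try every combination with one instance per feature and
    # keep those whose members are pairwise neighbours (a clique).
    for combo in product(*(by_feature[f] for f in pattern)):
        pairwise_near = all(
            near(a, b) for i, a in enumerate(combo) for b in combo[i + 1:]
        )
        if pairwise_near:
            for f, inst in zip(pattern, combo):
                participating[f].add(inst[0])

    # A pattern with a feature that has no instances cannot be prevalent.
    if any(not by_feature[f] for f in pattern):
        return 0.0

    # Participation ratio per feature; the index is the smallest ratio.
    return min(len(participating[f]) / len(by_feature[f]) for f in pattern)


demo_points = [("A", 0, 0), ("A", 10, 10), ("B", 0, 1), ("B", 0.5, 0.5)]
demo_pi = participation_index(demo_points, ("A", "B"), radius=2)
print(demo_pi)
```

In the demo, only one of the two `A` instances has a `B` neighbour, while both `B` instances have an `A` neighbour, so the ratios are 1/2 and 2/2. The exhaustive enumeration is exponential in the pattern size, which is the cost that join-less or tree-based methods such as those named in the abstract aim to avoid.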

    Spatiotemporal Big Data Analytics for Future Mobility

    University of Minnesota Ph.D. dissertation. May 2019. Major: Computer Science. Advisor: Shashi Shekhar. 1 computer file (PDF); xii, 161 pages. Recent years have witnessed an explosion of spatiotemporal big data (e.g., GPS trajectories, vehicle engine measurements, remote sensing imagery, and geotagged tweets), which has the potential to transform our societies. Terabytes of earth observation data are collected every day from thousands of places across the world. Modern vehicles are increasingly equipped with rich sensors that measure hundreds of engine variables (e.g., emissions, fuel consumption, and speed) annotated with timestamps and location data for every second of the vehicle’s trip. According to reports by McKinsey and Cisco, leveraging such data is potentially worth hundreds of billions of dollars annually in fuel savings. Spatiotemporal big data are also enabling many modern technologies such as on-demand transportation (e.g., Uber, Lyft). Today, the on-demand economy attracts millions of consumers annually and over $50 billion in spending. Even more growth is expected with the emergence of self-driving cars. However, spatiotemporal big data are of a volume, velocity, variety, and veracity that exceed the capability of common spatiotemporal data analytic techniques. My thesis investigates spatiotemporal big data analytics that address the volume and velocity challenges of spatiotemporal big data in the context of novel applications in transportation and engine science, future mobility, and the on-demand economy. The thesis proposes scalable algorithms for mining “Non-compliant Window Co-occurrence Patterns”, which allow the discovery of correlations in spatiotemporal big data with a large number of variables. Novel upper bounds were introduced for a statistical interest measure of association to efficiently prune uninteresting candidate patterns.
Case studies with real-world engine data demonstrated the ability of the proposed approaches to discover patterns that are of interest to engine scientists. To address the high-velocity challenge, the thesis explored online optimization heuristics for matching supply and demand in an on-demand spatial service broker. The proposed algorithms maximize the matching size while also maintaining a balanced provider utilization, to ensure robustness against variations in the supply-demand ratio and to ensure that providers do not drop out. The proposed algorithms were shown to outperform related work on multiple performance measures. In addition, the thesis proposed a scalable matching and scheduling algorithm for an on-demand pickup and delivery broker for moving consumers with multiple candidate delivery locations and time intervals. Extensive evaluation showed that the proposed approach yields significant computational savings without sacrificing solution quality.

    The Interconnected Arctic — UArctic Congress 2016

    climate change; Arctic; vulnerability; environment; marine and terrestrial polar landscapes; indigenous knowledge; touris

    Adjoint Modeling and Observing System Design in the Subpolar North Atlantic

    The near-surface ocean currents of the subpolar North Atlantic transport large amounts of heat from the subtropics to higher latitudes, affecting Arctic sea ice extent, the melting of the Greenland Ice Sheet, and the climate in western Europe and North America. Moreover, deep water formation in the subpolar North Atlantic actively shapes the Atlantic meridional overturning circulation, which connects the surface with the deep ocean and the northern with the southern hemisphere. The recently acquired data from the OSNAP (Overturning in the Subpolar North Atlantic Program) mooring array challenges our understanding of the processes that govern circulation and deep water formation in the subpolar North Atlantic. However, only long-term and sustained ocean observations can provide the much-needed benchmark to evaluate climate model simulations, to advance our understanding of key mechanisms, and to predict the role of the North Atlantic in future climate changes and anthropogenic carbon uptake. Unfortunately, most observational efforts rely on short-term funding periods. Given the cost of deploying and maintaining ocean observing systems, these systems have to be designed carefully. Key questions are: What information is contained in already existing observation networks? What do existing networks, such as the OSNAP array, tell us about hydrographic and circulation quantities in remote oceanic regions with few observations? In this thesis, a novel approach to ocean observing system design is explored that is able to address these questions. The approach makes use of adjoint modeling and Hessian-based Uncertainty Quantification (UQ) within a global oceanographic inverse problem. Adjoint-derived sensitivities reveal that the eastern boundary of the North Atlantic and the coasts of Iceland and Greenland are important pathways for communicating wind-driven pressure anomalies around the entire subpolar North Atlantic and the Nordic Seas. 
Consequently, the OSNAP observing array shares many dynamical pathways and mechanisms with oceanic quantities that are remote from the array. The OSNAP array therefore has the potential to inform these unobserved - or unobservable - quantities: for instance, ocean heat content in the Nordic Seas or close to Greenland’s margins. In this thesis, this potential is quantified within the state-of-the-art ECCO (Estimating the Circulation and Climate of the Ocean) state estimation framework, by combining physical relationships in the model with prior information and data uncertainties. The effectiveness of an observing system is determined by how well it captures climate-relevant signals and important dynamical adjustment mechanisms. A second important factor, however, is how strongly the monitored signals are masked by noise. All factors combined, heat transport measurements across the OSNAP-West transect, extending from Labrador to South Greenland, impose an overall much stronger constraint on the ECCO state estimate than heat transport measurements across the OSNAP-East transect, extending from South Greenland to Scotland. This is largely explained by the fact that climate signals detected by OSNAP-West are less noisy than climate signals detected by OSNAP-East. As a result, transport and hydrographic quantities - even in the Nordic Seas - are constrained more efficiently by OSNAP-West than by OSNAP-East observations, contrary to recent findings. This suggests that OSNAP-West is important for informing remote climate signals. This thesis explores the physical mechanisms that link the subpolar North Atlantic and the Nordic Seas, translates the mathematical concepts that underlie Hessian-based UQ to dynamical concepts, and discusses benefits, shortcomings, and future challenges for designing an effective, long-term Atlantic observing system by means of UQ within ocean state estimation. Doktorgradsavhandlin

    Statistical physics approaches to the complex Earth system

    Global warming, extreme climate events, earthquakes and their accompanying socioeconomic disasters pose significant risks to humanity. Yet due to the nonlinear feedbacks, multiple interactions and complex structures of the Earth system, the understanding and, in particular, the prediction of such disruptive events represent formidable challenges to both scientific and policy communities. In recent years, the emergence and evolution of Earth system science has attracted much attention and produced new concepts and frameworks. In particular, novel techniques based on statistical physics and complex networks have been developed and implemented to substantially advance our knowledge of the Earth system, including climate extreme events, earthquakes and geological relief features, leading to substantially improved predictive performance. We present here a comprehensive review of recent scientific progress in how combined statistical physics and complex systems science approaches, such as critical phenomena, network theory, percolation, tipping points analysis, and entropy, can be applied to complex Earth systems. Notably, these integrating tools and approaches provide new insights and perspectives for understanding the dynamics of the Earth systems. The overall aim of this review is to show readers how statistical physics concepts and theories can be useful in the field of Earth system science.

    Statistical physics approaches to the complex Earth system

    Global climate change, extreme climate events, earthquakes and their accompanying natural disasters pose significant risks to humanity. Yet due to the nonlinear feedbacks, strategic interactions and complex structure of the Earth system, the understanding and, in particular, the prediction of such disruptive events represent formidable challenges for both scientific and policy communities. In recent years, the emergence and evolution of Earth system science has attracted much attention and produced new concepts and frameworks. In particular, novel techniques based on statistical physics and complex networks have been developed and implemented to substantially advance our understanding of the Earth system, including climate extreme events, earthquakes and Earth geometric relief features, leading to substantially improved predictive performance. We present here a comprehensive review of recent scientific progress in how combined statistical physics and complex systems science approaches, such as critical phenomena, network theory, percolation, tipping points analysis, and entropy, can be applied to complex Earth systems (climate, earthquakes, etc.). Notably, these integrating tools and approaches provide new insights and perspectives for understanding the dynamics of the Earth systems. The overall aim of this review is to show readers how statistical physics approaches can be useful in the field of Earth system science.