1,160 research outputs found

    A General Spatio-Temporal Clustering-Based Non-local Formulation for Multiscale Modeling of Compartmentalized Reservoirs

    Representing the reservoir as a network of discrete compartments with neighbor and non-neighbor connections is a fast yet accurate method for analyzing oil and gas reservoirs. Automatic and rapid detection of coarse-scale compartments with distinct static and dynamic properties is an integral part of such high-level reservoir analysis. In this work, we present a hybrid framework specific to reservoir analysis for automatic detection of clusters in space using spatial and temporal field data, coupled with a physics-based multiscale modeling approach. The approach couples a physics-based non-local modeling framework with data-driven clustering techniques to provide fast and accurate multiscale modeling of compartmentalized reservoirs. This research also adds to the literature by presenting a comprehensive treatment of spatio-temporal clustering for reservoir studies that accounts for the clustering complexities, the intrinsically sparse and noisy nature of the data, and the interpretability of the outcome. Keywords: Artificial Intelligence; Machine Learning; Spatio-Temporal Clustering; Physics-Based Data-Driven Formulation; Multiscale Modelin

    Spatio-temporal clustering of natural hazards

    Natural hazards are inherently spatio-temporal processes. Spatio-temporal clustering methodologies applied to natural hazard data can help distinguish clustering patterns that would not only identify point-event dense regions and time periods, but also characterise the hazardous process. In Chapter 2, spatio-temporal clustering methodologies applicable to point event and trajectory datasets representative of natural hazards are reviewed by critically examining 143 scientific publications from various fields of study. These methodologies include clustering measures that are either (i) global (providing a single quantitative measure of the degree of clustering in the dataset) or (ii) local (i.e. assigning individual point events to a cluster). A common application and analysis framework combining global and local measures for point event data is proposed. For global measures, K-function analysis is selected; for local measures, a space-time scan statistic is selected, with kernel density estimation as an aiding methodology within the framework. For trajectories, the density-based local clustering measure Trajectory-OPTICS is selected. In Chapter 3, to assess the performance of the methodology framework, real-world natural hazard data and synthetic datasets, either representative of natural hazards or used as performance benchmarks for application, are presented and characterised. A point event dataset of 12,521 lightning strikes recorded on 1 July 2015 over the UK is selected, where a severe three-storm system crossed the region with different convective modes. It is also used as a case study together with a dataset of 77,252 lightning strikes on 28 June 2012 over the UK to characterise and model lightning strikes as point events produced by a moving source. Each source has a set number of point events, initiation point in space and time, movement speed, direction, inter-event time distribution and spatial spread distribution.
    Movement speed, inter-event time and spatial spread distributions are characterised based on the two case studies. Inter-event time values range from below 0.01 s to over 100 s for individual storms from both case studies. A least-squares plane fit in the spatio-temporal domain estimates a range of representative movement speed values of 47–60 km h⁻¹ for the first and 66–111 km h⁻¹ for the second case study. Based on these values, single-storm (Model 3) and three-storm (Model 4) models are generated to form a simulation study of point event datasets representing various physical lightning characteristics, each with three variations in their movement speed and spatial spread input parameters. For trajectories, the Atlantic hurricane database (HURDAT2) is used to select a real-world dataset of 316 hurricanes. Homogeneous and clustered trajectory datasets are generated as benchmarks for Trajectory-OPTICS. In Chapter 4, the clustering methodology framework identified in Chapter 2 is applied to all the real-world and synthetic datasets presented in Chapter 3. K-function analysis results are used to inform the range of bandwidth values for the kernel density estimation. A leave-one-out estimator is used to find the optimal values. A threshold on the probability density values from the kernel density estimation is imposed to identify high probability density space-time volumes. These volumes are used as centroids for applying the scan statistic as a local clustering measure. The elliptic scan statistic is unable to identify individual lightning strike clusters within the same storm source for storm sources with small temporal separation (Model 4). Chapter 5 extends the elliptic scan statistic by including an ‘Inclination height’ parameter as the temporal distance between the major axis points of the ellipse basis.
    With detailed selection of input parameter ranges, the inclined elliptic scan statistic is applied to Model 4 and its variations and is able to identify point event clusters produced by a moving source, and the point events assigned to each cluster are from the same storm source.
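
The abstract does not say how the least-squares plane fit yields a movement speed. One plausible reading, sketched here under the assumption that event time varies linearly with position along the direction of travel, is to fit t = a*x + b*y + c and take the speed as 1 / sqrt(a^2 + b^2). The function name `fit_plane_speed` and the synthetic data are illustrative assumptions, not taken from the thesis.

```python
import math

def fit_plane_speed(xs, ys, ts):
    """Least-squares fit of t = a*x + b*y + c (hypothetical helper).

    Returns (speed, heading_deg): speed in position units per time unit,
    heading as the angle from the x-axis of the direction of increasing t.
    """
    n = len(ts)
    mx, my, mt = sum(xs) / n, sum(ys) / n, sum(ts) / n
    # Centre the data so the intercept c drops out of the normal equations.
    dx = [x - mx for x in xs]
    dy = [y - my for y in ys]
    dt = [t - mt for t in ts]
    sxx = sum(u * u for u in dx)
    syy = sum(v * v for v in dy)
    sxy = sum(u * v for u, v in zip(dx, dy))
    sxt = sum(u * w for u, w in zip(dx, dt))
    syt = sum(v * w for v, w in zip(dy, dt))
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("points are collinear; the plane is not determined")
    # Solve the 2x2 normal equations by Cramer's rule.
    a = (sxt * syy - syt * sxy) / det
    b = (syt * sxx - sxt * sxy) / det
    slowness = math.hypot(a, b)  # time per unit distance
    if slowness == 0:
        raise ValueError("no temporal trend; speed is undefined")
    return 1.0 / slowness, math.degrees(math.atan2(b, a))

# Synthetic check (assumed data): a front moving at 50 km/h along (0.6, 0.8),
# sampled on a 10 km x 10 km grid, with times in hours.
xs = [float(i) for i in range(10) for _ in range(10)]
ys = [float(j) for _ in range(10) for j in range(10)]
ts = [(0.6 * x + 0.8 * y) / 50.0 for x, y in zip(xs, ys)]
speed, heading = fit_plane_speed(xs, ys, ts)
print(round(speed, 3), round(heading, 2))
```

With real strike data the spatial spread of events around the source acts as noise in the regressors, so an estimate of this kind would tend to be biased and is best treated as a rough indicator.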

    Spatio-temporal clustering in application

    The importance of machine learning methods in the data analysis of both academic research and industry applications has grown rapidly in recent years. This thesis will investigate how a method of unsupervised machine learning known as clustering can be employed to analyse spatial and spatio-temporal data from different fields of application. Spatio-temporal data present a particular challenge. In spatial contexts, the notion of dependency among geographically close elements needs to be considered when analysing the geographic distance as well as other spatial components. The temporal dimension of the data makes traditional dissimilarity metrics unsuitable due to the sequential ordering of data points. For this reason, this thesis will present ways of overcoming the shortcomings in existing methodologies when applied to these data types. By doing so, it will contribute to the literature on clustering through innovative extensions, adaptations, and considerations. The flexibility of clustering will be demonstrated in three different application contexts in health, finance, and marketing. As such, this thesis will also contribute to the academic literature in these areas and offer valuable insights into applicable machine learning methodology for practitioners.

    Privacy preserving spatio-temporal clustering on horizontally partitioned data

    Time-stamped location information is regarded as spatio-temporal data and, by its nature, such data is highly sensitive from the perspective of privacy. In this paper, we propose a privacy preserving spatio-temporal clustering method for horizontally partitioned data which, to the best of our knowledge, has not been done before. Our methods are based on building the dissimilarity matrix through a series of secure multi-party trajectory comparisons managed by a third party. Our trajectory comparison protocol complies with most trajectory comparison functions, and complexity analysis of our methods shows that our protocol does not introduce extra overhead when constructing the dissimilarity matrix, compared to the centralized approach. This work was funded by the Information Society Technologies programme of the European Commission, Future and Emerging Technologies under IST-014915 GeoPKDD project.
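
For orientation, the centralized baseline that the abstract compares against can be sketched as a plain pairwise dissimilarity matrix over trajectories. This is only an illustrative, non-private version: the secure multi-party protocol itself is not reproduced, and the comparison function used here (mean Euclidean distance between time-aligned samples) and the names `trajectory_distance` and `dissimilarity_matrix` are assumptions rather than the paper's choices.

```python
import math

def trajectory_distance(a, b):
    """Mean Euclidean distance between time-aligned (x, y) samples.

    Assumed comparison function; both trajectories must have equal length.
    """
    if len(a) != len(b) or not a:
        raise ValueError("trajectories must be non-empty and of equal length")
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)

def dissimilarity_matrix(trajectories, distance=trajectory_distance):
    """Symmetric n x n matrix of pairwise trajectory dissimilarities."""
    n = len(trajectories)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Each pair is compared once and mirrored across the diagonal.
            d[i][j] = d[j][i] = distance(trajectories[i], trajectories[j])
    return d

# Three short, made-up trajectories sampled at the same two time stamps.
trajs = [
    [(0.0, 0.0), (1.0, 0.0)],
    [(0.0, 3.0), (1.0, 3.0)],
    [(0.0, 4.0), (1.0, 4.0)],
]
matrix = dissimilarity_matrix(trajs)
for row in matrix:
    print(row)
```

A matrix of this form can be passed to any clustering routine that accepts precomputed dissimilarities; in the privacy-preserving setting each entry would instead come from a secure comparison between parties.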