3 research outputs found

    Detecting arbitrarily shaped clusters using ant colony optimization

    In a map of geo-referenced population and cases, detecting the most likely cluster (MLC), which is made up of many connected polygons (e.g., the boundaries of census tracts), may face two difficulties: the irregular shape of the cluster and the heterogeneity of the cluster. A heterogeneous cluster is one that contains depression links. A polygon is a depression link if it satisfies two conditions: (1) the ratio of cases to population in the polygon is below the average ratio of the whole map; (2) removing the polygon disconnects the cluster. Previous studies have successfully solved the problem of detecting arbitrarily shaped clusters that contain no depression links. For a heterogeneous cluster, however, existing methods may err, for example by missing parts of the cluster. In this article, a spatial scanning method based on ant colony optimization (AntScan) is proposed to improve the detection power. If each polygon is simplified to a node, the research area consisting of many polygons can be seen as a graph, and detecting the MLC becomes a search for the best subgraph (the one with the largest likelihood value) in that graph. The comparison between AntScan, GAScan (the spatial scan method based on genetic optimization), and SAScan (the spatial scan method based on simulated annealing optimization) indicates that (1) the performance of GAScan and SAScan is significantly influenced by the fraction parameter (the maximum allowed size of the detected cluster), which can only be estimated by multiple trials, while no such parameter is needed in AntScan; (2) AntScan shows superior power over GAScan and SAScan in detecting heterogeneous clusters. The case study on esophageal cancer in North China demonstrates that the cluster identified by AntScan has a larger likelihood value than the one detected by SAScan and covers all high-risk regions of esophageal cancer, whereas SAScan misses some high-risk regions (the region in the southwest of Shandong province, eastern China) due to the existence of a depression link.
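
The two conditions that define a depression link can be checked directly on the graph view of the map. The following is a minimal sketch, not the AntScan implementation: the function names, the toy polygons, and the adjacency dictionary are all hypothetical, and the "average ratio of the whole map" is assumed here to mean total cases divided by total population.

```python
from collections import deque


def is_connected(nodes, adjacency):
    """Breadth-first check that `nodes` induce a connected subgraph."""
    nodes = set(nodes)
    if not nodes:
        return True
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in adjacency[current]:
            if neighbor in nodes and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen == nodes


def depression_links(cluster, adjacency, cases, population):
    """Return the polygons in `cluster` that meet both conditions."""
    # Assumption: map-wide average ratio = total cases / total population.
    map_rate = sum(cases.values()) / sum(population.values())
    links = []
    for polygon in cluster:
        # Condition (1): local case/population ratio below the map average.
        low_rate = cases[polygon] / population[polygon] < map_rate
        # Condition (2): removing the polygon disconnects the cluster.
        rest = set(cluster) - {polygon}
        disconnects = not is_connected(rest, adjacency)
        if low_rate and disconnects:
            links.append(polygon)
    return sorted(links)


# Hypothetical toy map: a chain A-B-C-D plus an outside polygon E.
adjacency = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
cases = {"A": 30, "B": 25, "C": 2, "D": 30, "E": 5}
population = {name: 100 for name in adjacency}

cluster = ["A", "B", "C", "D"]
print(depression_links(cluster, adjacency, cases, population))
```

In this toy map, polygon C has a low rate and is the only bridge between B and D, so it is the kind of polygon that, according to the abstract, can cause a scan method to miss part of a heterogeneous cluster.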

    Data-intensive spatial pattern discovery based on generalized spatial point representations

    Geospatial big data consisting of records at the individual level or with fine spatial resolutions, such as geo-referenced social media posts and movement records collected using GPS, provide tremendous opportunities to understand complex geographic phenomena and their space-time dynamics. Such data have been widely used in many real-world applications, such as event detection and population migration analyses. These applications require not only efficient data handling and processing capabilities, but also innovative data models and analytical approaches that satisfy application-specific requirements. The aim of this dissertation research is to establish a suite of innovative methods for analyzing geospatial big data that can be modeled as generalized spatial points while addressing the following key research questions: How can the spatial and spatiotemporal patterns of geographic phenomena be estimated from geospatial big data based on spatial point models? How can these patterns be compared to gain insights into complex geographic phenomena? How can the computational intensity of the methods be estimated? How can cyberGIS be advanced to resolve the computational intensity? Specifically, novel methods are designed in this dissertation research to exploit spatial data characteristics, innovate spatial point pattern analytics, and resolve computational intensity through high-performance spatial algorithms. These methods are evaluated in the context of several real-world applications, including event detection from social media data and spatial movement pattern detection. Experimental results demonstrate that fine-scale spatial patterns can be revealed from geospatial big data using the proposed approaches. Novel cyberGIS software capabilities are also created as a result of this dissertation research.