271,272 research outputs found

    Spatial-temporal data mining procedure: LASR

    This paper is concerned with the statistical development of our spatial-temporal data mining procedure, LASR (pronounced "laser"). LASR is the abbreviation for Longitudinal Analysis with Self-Registration of large-p-small-n data. It was motivated by a study of "Neuromuscular Electrical Stimulation" (NMES) experiments, where the data are noisy and heterogeneous, might not align from one session to another, and involve a large number of multiple comparisons. The three main components of LASR are: (1) data segmentation for separating heterogeneous data and for distinguishing outliers, (2) automatic approaches for spatial and temporal data registration, and (3) statistical smoothing mapping for identifying "activated" regions based on false-discovery-rate controlled p-maps and movies. Each of the components is of interest in its own right. As a statistical ensemble, the idea of LASR is applicable to other types of spatial-temporal data sets beyond those from the NMES experiments. Comment: Published at http://dx.doi.org/10.1214/074921706000000707 in the IMS Lecture Notes--Monograph Series (http://www.imstat.org/publications/lecnotes.htm) by the Institute of Mathematical Statistics (http://www.imstat.org)
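
    The abstract mentions false-discovery-rate controlled p-maps but does not say which procedure is used. As a minimal sketch, assuming the standard Benjamini-Hochberg step-up rule (an assumption, not necessarily the LASR method), thresholding a flat list of per-location p-values could look like this. The function name and the level q are illustrative.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Flag hypotheses rejected at FDR level q (Benjamini-Hochberg step-up).

    Assumption: this is a generic FDR rule, not the procedure from the paper.
    Returns a list of booleans aligned with p_values.
    """
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= (k / m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k_max = rank

    # Reject every hypothesis whose rank is at most k_max.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values for four map locations.
    print(benjamini_hochberg([0.039, 0.001, 0.216, 0.008], q=0.05))
```

    Because the rule is step-up, a location with a p-value above its own threshold can still be flagged when a larger p-value further down the ranking meets its threshold.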

    GCG: Mining Maximal Complete Graph Patterns from Large Spatial Data

    Recent research on pattern discovery has progressed from mining frequent patterns and sequences to mining structured patterns, such as trees and graphs. Graphs, as a general data structure, can model complex relations among data, with wide applications in web exploration and social networks. However, mining large graph patterns is challenging because of the large number of subgraphs. In this paper, we aim to mine only frequent complete graph patterns. A graph g in a database is complete if every pair of distinct vertices is connected by a unique edge. Grid Complete Graph (GCG) is a mining algorithm developed to explore interesting pruning techniques for extracting maximal complete graphs from the large spatial datasets in Sloan Digital Sky Survey (SDSS) data. Using a divide-and-conquer strategy, GCG shows high efficiency, especially in the presence of a large number of patterns. We describe how GCG can mine not only simple co-location spatial patterns but also complex ones. To the best of our knowledge, this is the first algorithm to exploit the extraction of maximal complete graphs in the process of mining complex co-location patterns in large spatial datasets. Comment: 1
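
    The abstract does not spell out the GCG algorithm, so the following is only a sketch of the underlying idea: bucket points into a grid so that neighbor search stays local, link points that lie within a distance d of each other, and enumerate the maximal complete subgraphs. The enumeration uses the classic Bron-Kerbosch algorithm with pivoting, swapped in here as a stand-in; the function names and the distance threshold are assumptions.

```python
from collections import defaultdict
from itertools import product
from math import floor, hypot


def neighbor_graph(points, d):
    """Link points at most d apart, using a grid with cell size d.

    Two points within distance d always fall in the same or adjacent cells,
    so only the 3x3 block of cells around each cell needs to be checked.
    """
    cells = defaultdict(list)
    for i, (x, y) in enumerate(points):
        cells[(floor(x / d), floor(y / d))].append(i)

    adj = {i: set() for i in range(len(points))}
    for (cx, cy), members in cells.items():
        for dx, dy in product((-1, 0, 1), repeat=2):
            for j in cells.get((cx + dx, cy + dy), ()):
                for i in members:
                    if i < j:
                        dist = hypot(points[i][0] - points[j][0],
                                     points[i][1] - points[j][1])
                        if dist <= d:
                            adj[i].add(j)
                            adj[j].add(i)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal complete subgraphs (Bron-Kerbosch with pivoting)."""
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates, x: already-processed vertices.
        if not p and not x:
            cliques.append(sorted(r))
            return
        # Pivot on the vertex with the most neighbors among the candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return sorted(cliques)


if __name__ == "__main__":
    # Hypothetical 2-D object positions.
    pts = [(0, 0), (1, 0), (0, 1), (5, 5), (5.5, 5), (20, 20)]
    print(maximal_cliques(neighbor_graph(pts, d=1.5)))
```

    Isolated points come back as single-vertex cliques; a caller interested in co-location patterns would typically keep only cliques with two or more vertices.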

    A Statistical Toolbox For Mining And Modeling Spatial Data

    Most data mining projects in spatial economics start with an evaluation of a set of attribute variables on a sample of spatial entities, looking for the existence and strength of spatial autocorrelation based on Moran's and Geary's coefficients. The adequacy of these coefficients is rarely challenged, even though many users, when reporting on their properties, seem likely to make mistakes and to foster confusion. My paper begins with a critical appraisal of the classical definition and rationale of these indices. I argue that, while intuitively founded, they are plagued by an inconsistency in their conception. Then, I propose a principled small change leading to corrected spatial autocorrelation coefficients, which strongly simplifies their relationship and opens the way to an augmented toolbox of statistical methods of dimension reduction and data visualization, also useful for modeling purposes. A second section presents a formal framework, adapted from recent work in statistical learning, which gives theoretical support to our definition of corrected spatial autocorrelation coefficients. More specifically, the multivariate data mining methods presented here are easily implemented in existing (free) software and yield methods useful for exploiting the proposed corrections in spatial data analysis practice. From a mathematical point of view, their asymptotic behavior, already studied in a series of papers by Belkin & Niyogi, suggests that they possess robustness and a limited sensitivity to the Modifiable Areal Unit Problem (MAUP), qualities that are valuable in exploratory spatial data analysis.
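
    For orientation, the classical coefficients that the abstract critiques can be computed directly from their textbook definitions. The sketch below implements those classical forms only; the corrected coefficients proposed in the paper are not given in the abstract and are not reproduced here. The function name and the dense weight-matrix representation are assumptions.

```python
def moran_geary(values, weights):
    """Classical Moran's I and Geary's c for one attribute.

    values:  list of n attribute values, one per spatial entity.
    weights: n x n list of lists of spatial weights w_ij (zero diagonal).

    I = (n / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2
    c = ((n - 1) / (2 W)) * sum_ij w_ij (x_i - x_j)^2 / sum_i (x_i - mean)^2
    where W is the sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    sum_sq = sum(d * d for d in dev)
    w_total = sum(sum(row) for row in weights)

    # Weighted cross-products of deviations (numerator of Moran's I).
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    # Weighted squared differences (numerator of Geary's c).
    sq_diff = sum(weights[i][j] * (values[i] - values[j]) ** 2
                  for i in range(n) for j in range(n))

    moran = (n / w_total) * cross / sum_sq
    geary = ((n - 1) / (2 * w_total)) * sq_diff / sum_sq
    return moran, geary


if __name__ == "__main__":
    # Hypothetical example: four regions in a row, each adjacent to the next.
    w = [[0, 1, 0, 0],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [0, 0, 1, 0]]
    print(moran_geary([1, 2, 3, 4], w))
```

    Positive Moran's I and Geary's c below 1 both indicate that neighboring entities tend to have similar values; the two indices use different numerators, which is one reason their relationship is not straightforward in the classical form.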

    Spatial Data Preprocessing for Mining Spatial Association Rule with Conventional Association Mining Algorithms

    The increasing use of Geographical Information Systems (GIS) for various problems is causing the volume of spatial data to grow fast. Spatial data mining is one of several ways to find new knowledge in a data collection. One spatial data mining task is spatial association rule mining. Numerous association rule algorithms have been developed for mining associations. Unfortunately, most of these algorithms can only be used for mining non-spatial data in specific formats. Therefore, spatial data preprocessing is needed so that conventional association algorithms can be used on spatial data.
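
    The abstract does not describe the preprocessing itself. One common way to approach it, shown here purely as an illustrative sketch, is to turn each reference object into a transaction whose items are spatial predicates such as close_to(river), after which any conventional itemset or rule miner can run unchanged. All names, the predicate format, and the distance threshold below are hypothetical.

```python
from math import hypot


def to_transactions(reference, features, radius):
    """Convert spatial data into item transactions.

    reference: dict mapping a reference object name to its (x, y) position.
    features:  list of (feature_type, x, y) tuples.
    radius:    distance within which a feature counts as "close".

    Returns a dict mapping each reference object to a set of predicate items.
    """
    transactions = {}
    for name, (rx, ry) in reference.items():
        transactions[name] = {
            f"close_to({ftype})"
            for ftype, fx, fy in features
            if hypot(fx - rx, fy - ry) <= radius
        }
    return transactions


def support(transactions, itemset):
    """Fraction of transactions containing every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for items in transactions.values() if itemset <= items)
    return hits / len(transactions)


if __name__ == "__main__":
    # Hypothetical towns and nearby map features.
    towns = {"A": (0, 0), "B": (10, 0), "C": (0, 10)}
    feats = [("river", 1, 0), ("road", 0, 1), ("river", 10, 1), ("road", 50, 50)]
    tx = to_transactions(towns, feats, radius=2)
    print(tx)
    print(support(tx, {"close_to(river)", "close_to(road)"}))
```

    Once the data are in this form, the spatial computation is finished, and support or confidence thresholds can be applied exactly as they would be for market-basket data.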

    Text and spatial data mining

    Parcellation of the human brain by combining text mining and spatial data mining within a neuroinformatics database. Text mining: analysis of scientific abstracts. Spatial data mining: modeling of the distribution of Talairach coordinates. Seek communality between the text representation and the spatial representation by multivariate analysis.