271,272 research outputs found
Spatial-temporal data mining procedure: LASR
This paper is concerned with the statistical development of our
spatial-temporal data mining procedure, LASR (pronounced ``laser''). LASR is
the abbreviation for Longitudinal Analysis with Self-Registration of
large--small- data. It was motivated by a study of ``Neuromuscular
Electrical Stimulation'' experiments, where the data are noisy and
heterogeneous, might not align from one session to another, and involve a large
number of multiple comparisons. The three main components of LASR are: (1) data
segmentation for separating heterogeneous data and for distinguishing outliers,
(2) automatic approaches for spatial and temporal data registration, and (3)
statistical smoothing mapping for identifying ``activated'' regions based on
false-discovery-rate controlled -maps and movies. Each of the components is
of interest in its own right. As a statistical ensemble, the idea of LASR is
applicable to other types of spatial-temporal data sets beyond those from the
NMES experiments.Comment: Published at http://dx.doi.org/10.1214/074921706000000707 in the IMS
Lecture Notes--Monograph Series
(http://www.imstat.org/publications/lecnotes.htm) by the Institute of
Mathematical Statistics (http://www.imstat.org
GCG: Mining Maximal Complete Graph Patterns from Large Spatial Data
Recent research on pattern discovery has progressed from mining frequent
patterns and sequences to mining structured patterns, such as trees and graphs.
Graphs as general data structure can model complex relations among data with
wide applications in web exploration and social networks. However, the process
of mining large graph patterns is a challenge due to the existence of large
number of subgraphs. In this paper, we aim to mine only frequent complete graph
patterns. A graph g in a database is complete if every pair of distinct
vertices is connected by a unique edge. Grid Complete Graph (GCG) is a mining
algorithm developed to explore interesting pruning techniques to extract
maximal complete graphs from large spatial dataset existing in Sloan Digital
Sky Survey (SDSS) data. Using a divide and conquer strategy, GCG shows high
efficiency especially in the presence of large number of patterns. In this
paper, we describe GCG that can mine not only simple co-location spatial
patterns but also complex ones. To the best of our knowledge, this is the first
algorithm used to exploit the extraction of maximal complete graphs in the
process of mining complex co-location patterns in large spatial dataset.Comment: 1
A Statistical Toolbox For Mining And Modeling Spatial Data
Most data mining projects in spatial economics start with an evaluation of a set of attribute variables on a sample of spatial entities, looking for the existence and strength of spatial autocorrelation, based on the Moran’s and the Geary’s coefficients, the adequacy of which is rarely challenged, despite the fact that when reporting on their properties, many users seem likely to make mistakes and to foster confusion. My paper begins by a critical appraisal of the classical definition and rational of these indices. I argue that while intuitively founded, they are plagued by an inconsistency in their conception. Then, I propose a principled small change leading to corrected spatial autocorrelation coefficients, which strongly simplifies their relationship, and opens the way to an augmented toolbox of statistical methods of dimension reduction and data visualization, also useful for modeling purposes. A second section presents a formal framework, adapted from recent work in statistical learning, which gives theoretical support to our definition of corrected spatial autocorrelation coefficients. More specifically, the multivariate data mining methods presented here, are easily implementable on the existing (free) software, yield methods useful to exploit the proposed corrections in spatial data analysis practice, and, from a mathematical point of view, whose asymptotic behavior, already studied in a series of papers by Belkin & Niyogi, suggests that they own qualities of robustness and a limited sensitivity to the Modifiable Areal Unit Problem (MAUP), valuable in exploratory spatial data analysis
Spatial Data Preprocessing for Mining Spatial Association Rule with Conventional Association Mining Algorithms
The increasing usage of Geographical Information Systems (GIS) for various problems makes the volume of spatial data is growing fast. Spatial data mining is one of the several ways to find the new knowledge from data collection. One of spatial data mining tasks is spatial association rule. There are numerous association rule algorithms have been developed for mining association. Unfortunately, the most algorithms can only used for mining non-spatial and specific formatted data. Therefore, spatial data preprocessing is needed in order conventional association algorithms can be used for spatial data
Text and spatial data mining
Parcellation of the human brain Parcellation of the human brain by combining text mining and spatial data mining within a neuroinformatics database. Text mining: Analysis of scientific abstracts. Spatial data mining: Modeling of the distribution of Talairach coordinates. Seek communality between the the text representation and spatial representation by multivariate analysis
- …