3 research outputs found

    Spatial regression in large datasets: problem set solution

    In this dissertation we investigate a way to combine data mining methods with traditional spatial autoregressive models in the context of large spatial datasets. We begin by considering the numerical difficulties of handling massive datasets with the usual approaches, namely Maximum Likelihood estimation for spatial models and Spatial Two-Stage Least Squares. We then run a Monte Carlo simulation experiment to compare the accuracy and computational complexity of decomposition and approximation techniques for computing the Jacobian in spatial models, for various regular lattice structures. In particular, we consider one of the most common spatial econometric models: the spatial lag model (or SAR, spatial autoregressive model). We also contribute new evidence to the literature by examining the twofold effect on the computational complexity of these methods: the influence of the "size effect" and of the "sparsity effect". To overcome this computational problem, we propose a data mining methodology based on CART (Classification and Regression Trees) that explicitly accounts for spatial autocorrelation in the pseudo-residuals, in order to remove this effect and improve accuracy, with significant savings in computational complexity across a wide range of spatial datasets, both real and simulated.
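    The Jacobian term referred to above is the log-determinant ln|I - rho W| in the SAR log-likelihood. The abstract does not say which decomposition or approximation techniques were compared, so the following is only a minimal sketch under stated assumptions: a small rook-contiguity lattice with row-standardised weights, an exact LU-based value, an eigenvalue-based value, and a truncated Taylor series standing in for an approximation method. All function names are hypothetical.

```python
import numpy as np

def rook_lattice(rows, cols):
    """Binary rook-contiguity adjacency A and its row-standardised form W."""
    n = rows * cols
    A = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if r + 1 < rows:                 # neighbour below
                j = (r + 1) * cols + c
                A[i, j] = A[j, i] = 1.0
            if c + 1 < cols:                 # neighbour to the right
                j = r * cols + c + 1
                A[i, j] = A[j, i] = 1.0
    W = A / A.sum(axis=1, keepdims=True)     # each row sums to 1
    return A, W

def logdet_exact(W, rho):
    """ln|I - rho W| via an LU-based determinant (dense, O(n^3) per rho)."""
    _, logdet = np.linalg.slogdet(np.eye(W.shape[0]) - rho * W)
    return logdet

def logdet_eigen(A, rho):
    """ln|I - rho W| from eigenvalues, computed once and reusable for any rho.

    W = D^-1 A is similar to the symmetric D^-1/2 A D^-1/2, so the
    eigenvalues are real and eigvalsh can be used.
    """
    d = A.sum(axis=1)
    lam = np.linalg.eigvalsh(A / np.sqrt(np.outer(d, d)))
    return float(np.sum(np.log(1.0 - rho * lam)))

def logdet_taylor(W, rho, order=50):
    """Truncated series ln|I - rho W| = -sum_k rho^k tr(W^k) / k (assumed stand-in)."""
    total, Wk = 0.0, np.eye(W.shape[0])
    for k in range(1, order + 1):
        Wk = Wk @ W
        total -= rho ** k * np.trace(Wk) / k
    return total

A, W = rook_lattice(5, 5)
for rho in (0.2, 0.5, 0.8):
    print(rho, logdet_exact(W, rho), logdet_eigen(A, rho), logdet_taylor(W, rho))
```

    In this sketch the dense matrices hide the "sparsity effect" entirely; a study on large lattices would presumably use sparse storage, and the truncation order of the series would need to grow as rho approaches 1.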

    Knowledge Discovery from spatial transactions

    We propose a general mechanism for representing spatial transactions in a way that allows existing data mining methods to be used. Our proposal lets the analyst exploit the layered structure of geographical information systems to define the layers of interest and the relevant spatial relations among them. Given a reference object, its neighborhood can be described by considering the attributes of the object itself and of the objects related to it by the chosen relations. The resulting spatial transactions may either be treated as “traditional” transactions, by considering only the qualitative spatial relations, or their spatial extension can be exploited during the data mining process. We explore both cases. First, we tackle the problem of classifying a spatial dataset, taking the spatial component of the data into account when computing the statistical measure (i.e., the entropy) needed to learn the model. Then, we consider the task of extracting spatial association rules, focusing on the qualitative representation of the spatial relations. The feasibility of the process has been tested by implementing the proposed method on top of a GIS tool and by analyzing real-world data.
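    The abstract does not spell out how the spatial component enters the entropy. As one possible reading only, the sketch below computes Shannon entropy over class labels where each object contributes a weight, assumed here to be its spatial extent (for example an area), instead of a unit count. The function name and the weighting scheme are assumptions, not the authors' method.

```python
import math
from collections import defaultdict

def weighted_entropy(labels, weights=None):
    """Shannon entropy (in bits) of class labels.

    Each object contributes `weight` mass to its class; with weights=None
    every object counts as 1, which gives the ordinary entropy.
    """
    if weights is None:
        weights = [1.0] * len(labels)
    mass = defaultdict(float)
    for label, w in zip(labels, weights):
        mass[label] += w
    total = sum(mass.values())
    return -sum((m / total) * math.log2(m / total)
                for m in mass.values() if m > 0)

# Two objects of different classes: equal counts, unequal (hypothetical) areas.
print(weighted_entropy(["urban", "rural"]))
print(weighted_entropy(["urban", "rural"], weights=[3.0, 1.0]))
```

    With unit weights the two classes are balanced; once the assumed areas are used, the larger object dominates and the entropy drops, which is the kind of difference a spatially aware split criterion would pick up.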