    Understanding spatial data usability

    In recent geographical information science literature, a number of researchers have made passing reference to an apparently new characteristic of spatial data known as 'usability'. While this attribute is well known to professionals engaged in software engineering and computer interface design and testing, extending the concept to embrace information would seem to be a new development. Furthermore, while notions such as the use and value of spatial information, and the diffusion of spatial information systems, have been the subject of research since the late 1980s, the current references to usability clearly represent something that extends well beyond that initial research. Accordingly, the purposes of this paper are: (1) to understand what is meant by spatial data usability; (2) to identify the elements that might comprise usability; and (3) to consider what the related research questions might be.

    Refining Coarse-grained Spatial Data using Auxiliary Spatial Data Sets with Various Granularities

    We propose a probabilistic model for refining coarse-grained spatial data by utilizing auxiliary spatial data sets. Existing methods require that the spatial granularities of the auxiliary data sets be the same as the desired granularity of the target data. The proposed model can effectively make use of auxiliary data sets with various granularities by hierarchically incorporating Gaussian processes. With the proposed model, a distribution for each auxiliary data set on the continuous space is modeled using a Gaussian process, where the representation of uncertainty takes the level of granularity into account. The fine-grained target data are modeled by another Gaussian process that considers both the spatial correlation and the auxiliary data sets with their uncertainty. We integrate this Gaussian process with a spatial aggregation process that transforms the fine-grained target data into the coarse-grained target data, by which we can infer the fine-grained target Gaussian process from the coarse-grained data. Our model is designed such that inference of the model parameters based on the exact marginal likelihood is possible, in which the variables of the fine-grained target and auxiliary data are analytically integrated out. Our experiments on real-world spatial data sets demonstrate the effectiveness of the proposed model.
    Comment: Appears in Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI 2019).
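
    The core aggregation idea lends itself to a compact illustration. Below is a minimal NumPy sketch, not the authors' hierarchical multi-auxiliary model: a fine-grained field gets a Gaussian-process prior, the coarse observations are its noisy block averages, and the fine field is recovered by conditioning on those aggregates. The grid size, RBF kernel, and noise level are illustrative assumptions.

```python
# Minimal sketch (not the paper's full hierarchical multi-auxiliary
# model): recover a fine-grained field from coarse block averages by
# conditioning a Gaussian-process prior on the aggregated observations.
import numpy as np

rng = np.random.default_rng(0)

# Fine grid: 32 cells on [0, 1]; coarse data: 8 blocks of 4 cells each.
n_fine, block = 32, 4
n_coarse = n_fine // block
x = (np.arange(n_fine) + 0.5) / n_fine

def rbf_kernel(a, b, length=0.15, var=1.0):
    """Squared-exponential covariance between 1-D location vectors."""
    return var * np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length**2)

K = rbf_kernel(x, x) + 1e-8 * np.eye(n_fine)  # prior covariance of fine field

# Spatial aggregation operator H: each coarse value is its block's mean.
H = np.kron(np.eye(n_coarse), np.full((1, block), 1.0 / block))

# Simulate a latent fine field and its noisy coarse aggregates.
f_true = rng.multivariate_normal(np.zeros(n_fine), K)
noise = 0.05
y = H @ f_true + noise * rng.standard_normal(n_coarse)

# Posterior mean of the fine field given the coarse data:
#   E[f | y] = K H^T (H K H^T + noise^2 I)^{-1} y
S = H @ K @ H.T + noise**2 * np.eye(n_coarse)
f_post = K @ H.T @ np.linalg.solve(S, y)

print("RMSE of refined field:", np.sqrt(np.mean((f_post - f_true) ** 2)))
```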

    Is spatial information in ICT data reliable?

    An increasing number of human activities are studied using data produced by individuals' ICT devices. In particular, when ICT data contain spatial information, they represent an invaluable source for analyzing urban dynamics. However, there have been relatively few contributions investigating the robustness of such results against fluctuations of data characteristics. Here, we present a stability analysis of higher-level information extracted from mobile phone data passively produced during an entire year by 9 million individuals in Senegal. We focus on two information-retrieval tasks: (a) the identification of land use in the region of Dakar from the temporal rhythms of communication activity; (b) the identification of home and work locations of anonymized individuals, which enables the construction of Origin-Destination (OD) matrices of commuting flows. Our analysis reveals that the uncertainty of the results depends strongly on the sample size, the scale, and the period of the year at which the data were gathered. Nevertheless, the spatial distributions of land use computed for different samples are remarkably robust: on average, we observe more than 75% of shared surface area between the different spatial partitions when considering the activity of at least 100,000 users, whatever the scale. The OD matrix is less stable and depends on the scale, with a share of at least 75% of commuters in common when considering all types of flows constructed from the home-work locations of 100,000 users. For both tasks, better results can be obtained at larger levels of aggregation or by considering more users. These results confirm that ICT data are very useful sources for the spatial analysis of urban systems, but that their reliability should in general be tested more thoroughly.
    Comment: 11 pages, 9 figures + appendix. Extended version of the conference paper published in the proceedings of the 2016 Spatial Accuracy Conference, p. 9-17, Montpellier, France.
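
    As one concrete example of the kind of stability check described above, the sketch below compares two OD matrices built from different samples and reports the share of commuters they have in common. The matrices are random stand-ins rather than real data, and the min-based overlap statistic is an assumed definition, not necessarily the paper's exact one.

```python
# Hypothetical sketch of the OD-matrix stability check: the share of
# commuting flows that two OD matrices, built from two user samples,
# have in common. Matrices here are random stand-ins, not real data,
# and the min-based overlap statistic is an assumed definition.
import numpy as np

rng = np.random.default_rng(1)
n_zones = 20

# Two OD matrices of commuter counts, e.g. from two disjoint samples.
od_a = rng.poisson(5.0, size=(n_zones, n_zones))
od_b = rng.poisson(5.0, size=(n_zones, n_zones))

def common_share(a, b):
    """For each origin-destination pair the commuters in common are
    min(a_ij, b_ij); normalize by the average total number of flows."""
    return 2.0 * np.minimum(a, b).sum() / (a.sum() + b.sum())

print(f"share of commuters in common: {common_share(od_a, od_b):.1%}")
```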

    Spatial interpolation of high-frequency monitoring data

    Climate modelers generally require meteorological information on regular grids, but monitoring stations are, in practice, sited irregularly. Thus, there is a need to produce public data records that interpolate available data to a high-density grid, which can then be used to generate meteorological maps at a broad range of spatial and temporal scales. In addition to point predictions, quantifications of uncertainty are also needed. One way to accomplish this is to provide multiple simulations of the relevant meteorological quantities conditional on the observed data, taking into account the various uncertainties in predicting a space-time process at locations with no monitoring data. Using a high-quality dataset of minute-by-minute measurements of atmospheric pressure in north-central Oklahoma, this work describes a statistical approach to carrying out these conditional simulations. Based on observations at 11 stations, conditional simulations were produced at two other sites with monitoring stations. The resulting point predictions are very accurate, and the multiple simulations produce well-calibrated prediction uncertainties for temporal changes in atmospheric pressure but are substantially overconservative for the uncertainties in the predictions of (undifferenced) pressure.
    Comment: Published at http://dx.doi.org/10.1214/08-AOAS208 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).
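
    A minimal sketch of conditional Gaussian simulation, the technique this abstract describes: given observations at monitored sites, draw field realizations at unmonitored sites that honor the data, so the spread across simulations quantifies prediction uncertainty. The exponential covariance, one-dimensional transect, and site coordinates are illustrative assumptions rather than the paper's fitted space-time model.

```python
# Minimal sketch of conditional Gaussian simulation on a 1-D transect:
# draw field realizations at unmonitored sites that honor observations
# at monitored ones. The exponential covariance and site coordinates
# are illustrative assumptions, not the paper's fitted model.
import numpy as np

rng = np.random.default_rng(2)

obs_sites = np.array([0.05, 0.2, 0.35, 0.5, 0.7, 0.9])  # monitored
new_sites = np.array([0.42, 0.61])                       # prediction sites

def exp_cov(a, b, length=0.3, sill=1.0):
    """Exponential covariance, a common model for atmospheric fields."""
    return sill * np.exp(-np.abs(a[:, None] - b[None, :]) / length)

K_oo = exp_cov(obs_sites, obs_sites) + 1e-8 * np.eye(obs_sites.size)
K_no = exp_cov(new_sites, obs_sites)
K_nn = exp_cov(new_sites, new_sites)

# Stand-in observations drawn from the prior (real data would go here).
y_obs = rng.multivariate_normal(np.zeros(obs_sites.size), K_oo)

# Conditional (kriging) distribution at the new sites given the data.
w = np.linalg.solve(K_oo, K_no.T)                # kriging weights
mu_cond = w.T @ y_obs                            # point predictions
cov_cond = K_nn - K_no @ w + 1e-9 * np.eye(new_sites.size)

# Multiple conditional simulations quantify prediction uncertainty.
sims = rng.multivariate_normal(mu_cond, cov_cond, size=100)
print("point predictions:", mu_cond)
print("simulation spread (std):", sims.std(axis=0))
```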

    Reducing Spatial Data Complexity for Classification Models

    Intelligent data analytics is gradually becoming a day-to-day reality of today's businesses. However, despite rapidly increasing storage and computational power, current state-of-the-art predictive models still cannot handle massive and noisy corporate data warehouses. What is more, adaptive and real-time operational environments require multiple models to be frequently retrained, which further hinders their use. Various data reduction techniques, ranging from data sampling to density retention models, attempt to address this challenge by capturing a summarised data structure, yet they either do not account for labelled data or degrade the classification performance of the model trained on the condensed dataset. Our response is a new general framework for reducing the complexity of labelled data by means of controlled spatial redistribution of class densities in the input space. Using the Parzen Labelled Data Compressor (PLDC) as an example, we demonstrate a simulated data condensation process directly inspired by electrostatic field interaction, in which data points are moved and merged following attracting and repelling interactions with the other labelled data. The process is controlled by a class density function built on the original data, which acts as a class-sensitive potential field, ensuring preservation of the original class density distributions while allowing data to rearrange and merge, joining together their soft class partitions. As a result, we obtain a model that reduces labelled datasets much further than competing approaches, yet with maximum retention of the original class densities and hence of classification performance. PLDC leaves the reduced dataset with soft accumulative class weights, allowing for efficient online updates; as shown in a series of experiments, when coupled with the Parzen Density Classifier (PDC) it significantly outperforms competing data condensation methods in terms of classification performance at comparable compression levels.
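
    The condensation process itself is only outlined above, so the sketch below covers just the companion classifier: a Parzen density classifier operating on a condensed prototype set whose rows carry the accumulated soft class weights the abstract mentions. The prototype values, weights, and bandwidth are made-up toy inputs, not PLDC output.

```python
# Toy sketch of the companion Parzen Density Classifier on a condensed
# prototype set carrying accumulated soft class weights. The prototype
# values, weights, and bandwidth are made-up illustrative inputs; the
# condensation (merging) step itself is not reproduced here.
import numpy as np

def parzen_predict(X, prototypes, class_weights, bandwidth=0.5):
    """Classify rows of X via Gaussian-kernel class density estimates.

    prototypes    : (m, d) condensed data points
    class_weights : (m, k) soft class mass accumulated on each prototype
    """
    # Squared distances from every query point to every prototype.
    d2 = ((X[:, None, :] - prototypes[None, :, :]) ** 2).sum(-1)
    kern = np.exp(-0.5 * d2 / bandwidth**2)      # (n, m) kernel weights
    class_density = kern @ class_weights         # (n, k) per-class density
    return class_density.argmax(axis=1)

# Hypothetical condensed set: two merged prototypes per class.
prototypes = np.array([[0.0, 0.0], [1.0, 0.5], [3.0, 3.0], [4.0, 3.5]])
class_weights = np.array([[5.0, 0.2], [4.0, 1.0], [0.5, 6.0], [0.0, 7.0]])

X_test = np.array([[0.5, 0.2], [3.5, 3.2]])
print(parzen_predict(X_test, prototypes, class_weights))  # -> [0 1]
```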

    Regularized Principal Component Analysis for Spatial Data

    In many atmospheric and earth sciences, it is of interest to identify dominant spatial patterns of variation based on data observed at p locations and n time points, with the possibility that p > n. While principal component analysis (PCA) is commonly applied to find the dominant patterns, the eigenimages produced from PCA may exhibit patterns that are too noisy to be physically meaningful when p is large relative to n. To obtain more precise estimates of eigenimages, we propose a regularization approach incorporating smoothness and sparseness of eigenimages, while accounting for their orthogonality. Our method allows data taken at irregularly spaced or sparse locations. In addition, the resulting optimization problem can be solved using the alternating direction method of multipliers, which is easy to implement and applicable to a large spatial dataset. Furthermore, the estimated eigenfunctions provide a natural basis for representing the underlying spatial process in a spatial random-effects model, from which spatial covariance function estimation and spatial prediction can be efficiently performed using a regularized fixed-rank kriging method. Finally, the effectiveness of the proposed method is demonstrated by several numerical examples.
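
    A hedged sketch of the sparseness ingredient only, not the paper's full ADMM formulation (which additionally imposes smoothness and handles the orthogonality of several components): estimate a sparse leading eigenimage by interleaving soft-thresholding with power iterations on the sample covariance. All data and tuning values below are synthetic assumptions.

```python
# Hedged sketch of the sparseness ingredient only, not the paper's full
# ADMM formulation (which also imposes smoothness and handles the
# orthogonality of several components): estimate a sparse leading
# eigenimage by interleaving soft-thresholding with power iterations.
import numpy as np

rng = np.random.default_rng(3)

# Synthetic data: n = 20 time points, p = 200 locations (p > n), with a
# spatially localized true pattern buried in noise.
n, p = 20, 200
pattern = np.zeros(p)
pattern[60:90] = np.sin(np.linspace(0.0, np.pi, 30))
X = np.outer(rng.standard_normal(n), pattern) + 0.3 * rng.standard_normal((n, p))

def sparse_eigenimage(X, lam=0.05, n_iter=200):
    """Leading eigenvector of the sample covariance with an L1 penalty,
    via power iterations followed by soft-thresholding."""
    S = X.T @ X / X.shape[0]
    v = rng.standard_normal(S.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(n_iter):
        u = S @ v
        u = np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)  # soft-threshold
        norm = np.linalg.norm(u)
        if norm == 0.0:          # penalty too strong; everything zeroed
            break
        v = u / norm
    return v

v = sparse_eigenimage(X)
print("nonzero loadings:", np.count_nonzero(v), "of", p)
```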

    Forecasting with Spatial Panel Data

    This paper compares various forecasts using panel data with spatial error correlation. The true data generating process is assumed to be a simple error component regression model with spatial remainder disturbances of the autoregressive or moving average type. The best linear unbiased predictor (BLUP) is compared with other forecasts that ignore spatial correlation or ignore heterogeneity due to the individual effects, using Monte Carlo experiments. In addition, we check the performance of these forecasts under misspecification of the spatial error process, various spatial weight matrices, and heterogeneous rather than homogeneous panel data models.
    Keywords: forecasting, BLUP, panel data, spatial dependence, heterogeneity
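
    The flavor of such a Monte Carlo design can be mimicked in a few lines. The hypothetical sketch below simulates an error-component panel with spatially autoregressive remainder disturbances and compares a forecast that uses the estimated individual effects against one that ignores heterogeneity; this is a simplified stand-in for the BLUP comparison, with all parameter values assumed.

```python
# Hypothetical Monte Carlo in the spirit of the design above: an error-
# component panel with spatially autoregressive (SAR) remainder
# disturbances. A forecast using the estimated individual effects is
# compared against one ignoring heterogeneity; this is a simplified
# stand-in for the BLUP comparison, with all parameter values assumed.
import numpy as np

rng = np.random.default_rng(4)
N, T, beta, lam = 25, 40, 1.0, 0.4

# Row-normalized weight matrix on a ring of N spatial units.
W = np.zeros((N, N))
for i in range(N):
    W[i, (i - 1) % N] = W[i, (i + 1) % N] = 0.5

A_inv = np.linalg.inv(np.eye(N) - lam * W)   # SAR transform (I - lam*W)^-1

mu = rng.standard_normal(N)                  # individual effects
x = rng.standard_normal((T + 1, N))          # regressor incl. forecast period
eps = (A_inv @ rng.standard_normal((N, T + 1))).T
y = beta * x + mu + eps

# Within (fixed-effects) estimate of beta on periods 0..T-1.
xd = x[:T] - x[:T].mean(axis=0)
yd = y[:T] - y[:T].mean(axis=0)
b_hat = (xd * yd).sum() / (xd * xd).sum()
mu_hat = (y[:T] - b_hat * x[:T]).mean(axis=0)   # recovered effects

# One-step-ahead forecasts for period T.
f_with = b_hat * x[T] + mu_hat               # uses individual effects
f_naive = b_hat * x[T]                       # ignores heterogeneity

rmse = lambda f: np.sqrt(np.mean((f - y[T]) ** 2))
print(f"RMSE with effects: {rmse(f_with):.3f}   ignoring: {rmse(f_naive):.3f}")
```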