5,518 research outputs found

    Performance of the supervised generative classifiers of spatio-temporal areal data using various spatial autocorrelation indexes

    This article is concerned with a generative approach to supervised classification of spatio-temporal data collected at fixed areal units and modeled by a Gaussian Markov random field. We focus on classifiers based on Bayes discriminant functions formed by the log-ratio of the class-conditional likelihoods. As a novel modeling contribution, we propose to use decision threshold values induced by three popular spatial autocorrelation indexes, i.e., Moran’s I, Geary’s C and Getis–Ord G. The goal of this study is to extend recent investigations in the context of geostatistical and hidden Markov Gaussian models to the context of areal Gaussian Markov models. The classifier performance measures are the average accuracy rate, i.e. the percentage of correctly classified test data, the balanced accuracy rate, given by the average of sensitivity and specificity, and the geometric mean of sensitivity and specificity. The proposed methodology is illustrated using annual death rate data collected by the Institute of Hygiene of the Republic of Lithuania from the 60 municipalities in the period from 2001 to 2019. The classification model selection procedure is illustrated on three data sets with class labels specified by thresholds on the mortality index for acute cardiovascular events, malignant neoplasms and diseases of the circulatory system. The presented critical comparison of the proposed classifiers with various spatial autocorrelation indexes (decision threshold values) and a classifier based on a hidden Markov model can aid in the selection of proper classification techniques for spatio-temporal areal data
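
    The three performance measures named in this abstract can be illustrated with a minimal Python sketch. It assumes binary class labels coded 0/1; the function name `classification_rates` and the toy label vectors are hypothetical and are not taken from the article.

```python
import math


def classification_rates(y_true, y_pred):
    """Return (accuracy, balanced accuracy, geometric mean) for 0/1 labels.

    Assumes both classes occur in y_true, so sensitivity and
    specificity are well defined.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    sensitivity = tp / (tp + fn)  # share of class-1 units recovered
    specificity = tn / (tn + fp)  # share of class-0 units recovered

    accuracy = (tp + tn) / len(pairs)
    balanced_accuracy = (sensitivity + specificity) / 2
    geometric_mean = math.sqrt(sensitivity * specificity)
    return accuracy, balanced_accuracy, geometric_mean


# Hypothetical test labels for ten areal units (illustration only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
accuracy, balanced_accuracy, geometric_mean = classification_rates(y_true, y_pred)
```

    With unbalanced classes the plain accuracy rate can look good even when one class is poorly recovered, which is presumably why the balanced and geometric-mean variants are reported alongside it.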

    Discrete versus continuous domain models for disease mapping

    The main goal of disease mapping is to estimate disease risk and identify high-risk areas. Such analyses are hampered by the limited geographical resolution of the available data. Typically the available data are counts per spatial unit and the common approach is the Besag–York–Mollié (BYM) model. When precise geocodes are available, it is more natural to use Log-Gaussian Cox processes (LGCPs). In a simulation study mimicking childhood leukaemia incidence using actual residential locations of all children in the canton of Zürich, Switzerland, we compare the ability of these models to recover risk surfaces and identify high-risk areas. We then apply both approaches to actual data on childhood leukaemia incidence in the canton of Zürich during 1985-2015. We found that LGCPs outperform BYM models in almost all scenarios considered. Our findings suggest that there are important gains to be made from the use of LGCPs in spatial epidemiology. Comment: 28 pages, 4 figures, 2 tables

    CARBayes: an R package for Bayesian spatial modeling with conditional autoregressive priors

    Conditional autoregressive models are commonly used to represent spatial autocorrelation in data relating to a set of non-overlapping areal units, which arise in a wide variety of applications including agriculture, education, epidemiology and image analysis. Such models are typically specified in a hierarchical Bayesian framework, with inference based on Markov chain Monte Carlo (MCMC) simulation. The most widely used software to fit such models is WinBUGS or OpenBUGS, but in this paper we introduce the R package CARBayes. The main advantage of CARBayes compared with the BUGS software is its ease of use, because: (1) the spatial adjacency information is easy to specify as a binary neighbourhood matrix; and (2) given the neighbourhood matrix the models can be implemented by a single function call in R. This paper outlines the general class of Bayesian hierarchical models that can be implemented in the CARBayes software, describes their implementation via MCMC simulation techniques, and illustrates their use with two worked examples in the fields of house price analysis and disease mapping
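
    The binary neighbourhood matrix mentioned in this abstract can be sketched in a language-neutral way. The Python below is only an illustration of the data structure, not the CARBayes API (CARBayes itself is an R package); the function name `binary_neighbourhood_matrix` and the four-unit layout are hypothetical.

```python
def binary_neighbourhood_matrix(n_units, adjacent_pairs):
    """Build a symmetric 0/1 matrix W with W[i][j] = 1 if units i and j share a border."""
    w = [[0] * n_units for _ in range(n_units)]
    for i, j in adjacent_pairs:
        if i == j:
            raise ValueError("a unit is not its own neighbour")
        w[i][j] = 1
        w[j][i] = 1  # adjacency is symmetric
    return w


# Hypothetical 2 x 2 grid of areal units, numbered row by row:
#   0 1
#   2 3
# Units sharing an edge are neighbours; diagonal contact is ignored here.
pairs = [(0, 1), (0, 2), (1, 3), (2, 3)]
W = binary_neighbourhood_matrix(4, pairs)
```

    The row sums of such a matrix give the number of neighbours of each unit, a quantity that conditional autoregressive priors typically rely on.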

    Stochastic partial differential equation based modelling of large space-time data sets

    Increasingly larger data sets of processes in space and time ask for statistical models and methods that can cope with such data. We show that the solution of a stochastic advection-diffusion partial differential equation provides a flexible model class for spatio-temporal processes which is computationally feasible also for large data sets. The Gaussian process defined through the stochastic partial differential equation has in general a nonseparable covariance structure. Furthermore, its parameters can be physically interpreted as explicitly modeling phenomena such as transport and diffusion that occur in many natural processes in diverse fields ranging from environmental sciences to ecology. In order to obtain computationally efficient statistical algorithms we use spectral methods to solve the stochastic partial differential equation. This has the advantage that approximation errors do not accumulate over time, and that in the spectral space the computational cost grows linearly with the dimension, the total computational costs of Bayesian or frequentist inference being dominated by the fast Fourier transform. The proposed model is applied to postprocessing of precipitation forecasts from a numerical weather prediction model for northern Switzerland. In contrast to the raw forecasts from the numerical model, the postprocessed forecasts are calibrated and quantify prediction uncertainty. Moreover, they outperform the raw forecasts, in the sense that they have a lower mean absolute error