Stochastic simulation and spatial statistics of large datasets using parallel computing

Abstract

Lattice models represent spatial locations as a grid in which each cell occupies a certain state and evolves according to transition rules and rates that depend on its surrounding neighbourhood. These models can describe many phenomena, such as the growth of a forest fire front. Spatial simulation models of this kind, as well as spatial descriptive statistics such as Ripley's K-function, have wide applicability in spatial statistics but in general do not scale well to large datasets. Parallel computing (high performance computing) is one solution that can provide limited scalability to these applications. Here this is done using the Message Passing Interface (MPI) framework, implemented in R through the Rmpi package. Other useful techniques in spatial statistics, such as point pattern reconstruction and Markov chain Monte Carlo (MCMC) methods, are also discussed from a parallel computing perspective. In particular, an improved point pattern reconstruction is given and implemented in parallel. Single-chain MCMC methods are also examined and improved upon to give faster convergence using parallel computing. Optimizations and complications that arise from parallelizing existing spatial statistics algorithms are discussed, and the methods are implemented in an accompanying R package, parspatstat.
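
To make the scaling concern concrete, the following is a minimal sketch of a naive estimate of Ripley's K-function, K(r) = A / (n(n-1)) times the number of ordered pairs of distinct points within distance r, where A is the area of the observation window. It is written in Python for illustration only and is not the implementation in parspatstat; the function name ripley_k_naive is hypothetical, and no edge correction is applied, which is an assumption made to keep the example short.

```python
import math


def ripley_k_naive(points, r, area):
    """Naive estimate of Ripley's K at distance r (no edge correction).

    points: list of (x, y) tuples; area: area of the observation window.
    Hypothetical helper for illustration, not part of any package.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    close_pairs = 0
    # Every pair of points is compared once, so the cost grows as n^2.
    for i in range(n):
        xi, yi = points[i]
        for j in range(i + 1, n):
            xj, yj = points[j]
            if math.hypot(xi - xj, yi - yj) <= r:
                # Count both ordered pairs (i, j) and (j, i).
                close_pairs += 2
    return area * close_pairs / (n * (n - 1))


# Usage: the four corners of the unit square, evaluated at r = 1.
corners = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
k_at_1 = ripley_k_naive(corners, r=1.0, area=1.0)
```

The double loop compares every pair of points, so the work grows quadratically with the number of points. Each value of i in the outer loop is independent of the others, which is what makes this kind of statistic a natural candidate for splitting across parallel workers.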
