yaImpute: An R Package for kNN Imputation

Authors: Andrew O. Finley; Nicholas L. Crookston

This article introduces yaImpute, an R package for nearest neighbor search and imputation. Although nearest neighbor imputation is used in a host of disciplines, the methods implemented in the yaImpute package are tailored to imputation-based forest attribute estimation and mapping. The impetus for writing yaImpute is a growing interest in nearest neighbor imputation methods for spatially explicit forest inventory, and a need within this research community for software that facilitates comparison among different nearest neighbor search algorithms and subsequent imputation techniques. yaImpute provides directives for defining the search space, subsequent distance calculation, and imputation rules for a given number of nearest neighbors. Further, the package offers a suite of diagnostics for comparison among results generated from different imputation analyses and a set of functions for mapping imputation results.
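The idea summarized in this abstract (define a search space, compute distances, apply an imputation rule over k neighbors) can be illustrated with a minimal Python sketch. This is not the yaImpute API: the function name `knn_impute`, the Euclidean distance, the unweighted-mean rule, and the toy data are all assumptions chosen for illustration.

```python
import math

def knn_impute(ref_x, ref_y, target_x, k=1):
    """Impute a response for each target from its k nearest reference observations.

    ref_x: predictor vectors for observations whose response is known
    ref_y: the known responses, in the same order as ref_x
    target_x: predictor vectors for observations whose response is missing
    """
    imputed = []
    for t in target_x:
        # Euclidean distance from this target to every reference observation
        dists = [(math.dist(t, x), i) for i, x in enumerate(ref_x)]
        nearest = sorted(dists)[:k]
        # Imputation rule (assumed here): unweighted mean of the neighbors' responses
        imputed.append(sum(ref_y[i] for _, i in nearest) / k)
    return imputed

# Hypothetical toy data: two predictors per observation, one response
ref_x = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
ref_y = [10.0, 20.0, 30.0, 100.0]
target_x = [(0.1, 0.1), (4.0, 4.5)]

print(knn_impute(ref_x, ref_y, target_x, k=1))
print(knn_impute(ref_x, ref_y, target_x, k=3))
```

With k = 1 each target simply receives the response of its single closest reference observation; larger k averages over more neighbors, which smooths the imputed values.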

Research Papers in Economics

Hierarchical spatial models for predicting tree species assemblages across large domains

Authors: Banerjee Sudipto; Finley Andrew O.; McRoberts Ronald E.
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 08/10/2009

Spatially explicit data layers of tree species assemblages, referred to as forest types or forest type groups, are a key component in large-scale assessments of forest sustainability, biodiversity, timber biomass, carbon sinks and forest health monitoring. This paper explores the utility of coupling georeferenced national forest inventory (NFI) data with readily available and spatially complete environmental predictor variables through spatially-varying multinomial logistic regression models to predict forest type groups across large forested landscapes. These models exploit underlying spatial associations within the NFI plot array and the spatially-varying impact of predictor variables to improve the accuracy of forest type group predictions. The richness of these models incurs onerous computational burdens and we discuss dimension-reducing spatial processes that retain the richness in modeling. We illustrate using NFI data from Michigan, USA, where we provide a comprehensive analysis of this large study area and demonstrate improved prediction with associated measures of uncertainty. Comment: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) at http://dx.doi.org/10.1214/09-AOAS250 by the Institute of Mathematical Statistics (http://www.imstat.org).

arXiv.org e-Print Archive

Crossref

spBayes: An R Package for Univariate and Multivariate Hierarchical Point-referenced Spatial Models

Authors: Andrew O. Finley; Bradley P. Carlin; Sudipto Banerjee

Scientists and investigators in such diverse fields as geological and environmental sciences, ecology, forestry, disease mapping, and economics often encounter spatially referenced data collected over a fixed set of locations with coordinates (latitude-longitude, Easting-Northing etc.) in a region of study. Such point-referenced or geostatistical data are often best analyzed with Bayesian hierarchical models. Unfortunately, fitting such models involves computationally intensive Markov chain Monte Carlo (MCMC) methods whose efficiency depends upon the specific problem at hand. This requires extensive coding on the part of the user and the situation is not helped by the lack of available software for such algorithms. Here, we introduce a statistical software package, spBayes, built upon the R statistical computing platform that implements a generalized template encompassing a wide variety of Gaussian spatial process models for univariate as well as multivariate point-referenced data. We discuss the algorithms behind our package and illustrate its use with a synthetic and real data example.

Research Papers in Economics

Hierarchical Nearest-Neighbor Gaussian Process Models for Large Geostatistical Datasets

Authors: Banerjee Sudipto; Datta Abhirup; Finley Andrew O.; Gelfand Alan E.
Publication date: 01/01/2014

Spatial process models for analyzing geostatistical data entail computations that become prohibitive as the number of spatial locations becomes large. This manuscript develops a class of highly scalable Nearest Neighbor Gaussian Process (NNGP) models to provide fully model-based inference for large geostatistical datasets. We establish that the NNGP is a well-defined spatial process providing legitimate finite-dimensional Gaussian densities with sparse precision matrices. We embed the NNGP as a sparsity-inducing prior within a rich hierarchical modeling framework and outline how computationally efficient Markov chain Monte Carlo (MCMC) algorithms can be executed without storing or decomposing large matrices. The number of floating point operations (flops) per iteration of this algorithm is linear in the number of spatial locations, thereby rendering substantial scalability. We illustrate the computational and inferential benefits of the NNGP over competing methods using simulation studies and also analyze forest biomass from a massive United States Forest Inventory dataset at a scale that precludes alternative dimension-reducing methods.
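The nearest-neighbor idea in this abstract can be sketched as a product of small conditional Gaussian densities, each conditioning a location only on a few earlier neighbors. The Python sketch below is not the authors' implementation. It assumes sorted, distinct 1-D locations, an exponential covariance, a zero mean, and a single neighbor per location (m = 1); the names `exp_cov` and `nngp_logdensity` are hypothetical.

```python
import math

def exp_cov(s, t, sigma2=1.0, phi=1.0):
    """Exponential covariance between 1-D locations s and t (an assumed choice)."""
    return sigma2 * math.exp(-phi * abs(s - t))

def nngp_logdensity(locs, w, sigma2=1.0, phi=1.0):
    """Log-density of w under a nearest-neighbor approximation with m = 1.

    Assumes locs is sorted ascending with distinct values. Each w[i] is
    conditioned only on the closest earlier location, so no n-by-n matrix
    is stored or decomposed and the loop does constant work per location.
    """
    total = 0.0
    for i, s in enumerate(locs):
        if i == 0:
            # First location has no earlier neighbor: marginal N(0, C(s, s))
            mean, var = 0.0, exp_cov(s, s, sigma2, phi)
        else:
            # With sorted 1-D locations the nearest earlier point is i - 1
            j = i - 1
            c_ij = exp_cov(s, locs[j], sigma2, phi)
            c_jj = exp_cov(locs[j], locs[j], sigma2, phi)
            # Conditional mean and variance of w[i] given w[j]
            mean = c_ij / c_jj * w[j]
            var = exp_cov(s, s, sigma2, phi) - c_ij ** 2 / c_jj
        total += -0.5 * (math.log(2 * math.pi * var) + (w[i] - mean) ** 2 / var)
    return total

# Hypothetical example: five locations on a line with made-up values
locs = [0.0, 0.5, 1.2, 2.0, 3.5]
w = [0.3, 0.1, -0.4, 0.2, 0.0]
print(nngp_logdensity(locs, w))
```

With more neighbors per location (m > 1) or locations in two dimensions, each conditional would require solving a small m-by-m linear system instead of the scalar division used here.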

arXiv.org e-Print Archive

CiteSeerX

Crossref

eScholarship - University of California

The Francis Crick Institute

yaImpute: An R Package for kNN Imputation

Authors: Crookston Nicholas L.; Finley Andrew O.
Publication venue: 'Foundation for Open Access Statistics'
Publication date: 01/10/2007

Crossref

Directory of Open Access Journals

Journal of Statistical Software