120 research outputs found
Estimating Abundance from Counts in Large Data Sets of Irregularly-Spaced Plots using Spatial Basis Functions
Monitoring plant and animal populations is an important goal for both
academic research and management of natural resources. Successful management of
populations often depends on obtaining estimates of their mean or total over a
region. The basic problem considered in this paper is the estimation of a total
from a sample of plots containing count data, but the plot placements are
spatially irregular and non-randomized. Our application had counts from
thousands of irregularly-spaced aerial photo images. We used change-of-support
methods to model counts in images as a realization of an inhomogeneous Poisson
process that used spatial basis functions to model the spatial intensity
surface. The method was very fast and took only a few seconds for thousands of
images. The fitted intensity surface was integrated to provide an estimate from
all unsampled areas, which is added to the observed counts. The proposed method
also provides a finite area correction factor to variance estimation. The
intensity surface from an inhomogeneous Poisson process tends to be too smooth
for locally clustered points, typical of animal distributions, so we introduce
several new overdispersion estimators due to poor performance of the classic
one. We used simulated data to examine estimation bias and to investigate
several variance estimators with overdispersion. A real example is given of
harbor seal counts from aerial surveys in an Alaskan glacial fjord.
Comment: 37 pages, 7 figures, 4 tables, keywords: sampling, change-of-support,
spatial point processes, intensity function, random effects, Poisson process,
overdispersion
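The total estimator described in this abstract (observed counts plus the integral of the fitted intensity over unsampled areas) can be sketched roughly as follows. This is only an illustration under stated assumptions: the Gaussian basis functions, the log link, the midpoint-rule integration over grid cells, and all names and parameter values are hypothetical and not taken from the paper, and the model-fitting step is not shown.

```python
import math

def gaussian_basis(x, y, knot, scale):
    """Hypothetical radial basis function centred at `knot` (an assumption)."""
    dx, dy = x - knot[0], y - knot[1]
    return math.exp(-(dx * dx + dy * dy) / (2.0 * scale ** 2))

def intensity(x, y, beta0, betas, knots, scale):
    """Log-linear intensity surface: exp(beta0 + sum_j beta_j * phi_j(x, y))."""
    eta = beta0 + sum(b * gaussian_basis(x, y, k, scale)
                      for b, k in zip(betas, knots))
    return math.exp(eta)

def estimate_total(observed_counts, unsampled_cells, cell_area,
                   beta0, betas, knots, scale):
    """Observed counts plus the intensity integrated over unsampled cells.

    The integral is approximated by evaluating the intensity at each
    unsampled cell centre and multiplying by the cell area.
    """
    predicted = sum(intensity(x, y, beta0, betas, knots, scale) * cell_area
                    for x, y in unsampled_cells)
    return sum(observed_counts) + predicted

# Toy usage with made-up coefficients (assumed already fitted elsewhere).
knots = [(0.0, 0.0), (1.0, 1.0)]
cells = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)]
total = estimate_total([3, 4], cells, 0.25, math.log(2.0), [0.5, -0.5], knots, 0.5)
```

Because the observed counts enter the total directly, only the unsampled part carries prediction uncertainty, which is consistent with the finite area correction mentioned in the abstract.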
Space–time zero-inflated count models of Harbor seals
Environmental data are spatial, temporal, and often come with many zeros. In this paper, we included space–time random effects in zero-inflated Poisson (ZIP) and ‘hurdle’ models to investigate haulout patterns of harbor seals on glacial ice. The data consisted of counts, for 18 dates on a lattice grid of samples, of harbor seals hauled out on glacial ice in Disenchantment Bay, near Yakutat, Alaska. A hurdle model is similar to a ZIP model except it does not mix zeros from the binary and count processes. Both models can be used for zero-inflated data, and we compared space–time ZIP and hurdle models in a Bayesian hierarchical model. Space–time ZIP and hurdle models were constructed by using spatial conditional autoregressive (CAR) models and temporal first-order autoregressive (AR(1)) models as random effects in ZIP and hurdle regression models. We created maps of smoothed predictions for harbor seal counts based on ice density, other covariates, and spatio-temporal random effects. For both models, predictions around the edges appeared to be positively biased. The linex loss function is an asymmetric loss function that penalizes overprediction more than underprediction, and we used it to correct for prediction bias to get the best map for space–time ZIP and hurdle models.
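To make the distinction between the two zero models concrete, here is a minimal sketch of the ZIP and hurdle probability mass functions, together with one common parameterization of the linex loss. The function names, the parameter values, and the specific linex form (exp(a·d) − a·d − 1 with a > 0) are assumptions for illustration; the abstract does not state which parameterization the authors used, and no random effects are included here.

```python
import math

def poisson_pmf(y, lam):
    """Plain Poisson probability of count y with mean lam."""
    return math.exp(-lam) * lam ** y / math.factorial(y)

def zip_pmf(y, pi, lam):
    """Zero-inflated Poisson: zeros arise from BOTH the binary part
    (probability pi) and the Poisson count part."""
    extra_zero = pi if y == 0 else 0.0
    return extra_zero + (1.0 - pi) * poisson_pmf(y, lam)

def hurdle_pmf(y, p0, lam):
    """Hurdle model: all zeros come from the binary part (probability p0);
    positive counts follow a zero-truncated Poisson."""
    if y == 0:
        return p0
    return (1.0 - p0) * poisson_pmf(y, lam) / (1.0 - math.exp(-lam))

def linex_loss(pred, truth, a=1.0):
    """Linex loss; with a > 0, overprediction (pred > truth) costs more
    than underprediction of the same size."""
    d = pred - truth
    return math.exp(a * d) - a * d - 1.0

# Overpredicting by one unit is penalized more than underpredicting by one.
over = linex_loss(3.0, 2.0)
under = linex_loss(1.0, 2.0)
```

In the ZIP model the probability of a zero is pi + (1 − pi)·exp(−lam), whereas in the hurdle model it is simply p0, which is the "does not mix zeros" property described above.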
A mixed-model moving-average approach to geostatistical modeling in stream networks
Spatial autocorrelation is an intrinsic characteristic in freshwater stream environments where nested watersheds and flow connectivity may produce patterns that are not captured by Euclidean distance. Yet, many common autocovariance functions used in geostatistical models are statistically invalid when Euclidean distance is replaced with hydrologic distance. We use simple worked examples to illustrate a recently developed moving-average approach used to construct two types of valid autocovariance models that are based on hydrologic distances. These models were designed to represent the spatial configuration, longitudinal connectivity, discharge, and flow direction in a stream network. They also exhibit a different covariance structure than Euclidean models and represent a true difference in the way that spatial relationships are represented. Nevertheless, the multi-scale complexities of stream environments may not be fully captured using a model based on one covariance structure. We advocate using a variance component approach, which allows a mixture of autocovariance models (Euclidean and stream models) to be incorporated into a single geostatistical model. As an example, we fit and compare “mixed models,” based on multiple covariance structures, for a biological indicator. The mixed model proves to be a flexible approach because many sources of information can be incorporated into a single model.
Modeling growth of mandibles in the Western Arctic caribou herd
We compared growth curves for ramus length and diastema length from two autumn collections of mandibles of male Western Arctic Herd caribou in Alaska. We were primarily interested in determining whether growth curves of caribou mandibles differed between caribou born during 1959-1967, after herd size had been high for several years and was probably declining, and those born during 1976-1988, when the herd was increasing in size. To compare these growth curves, we fit a nonlinear model using maximum likelihood estimates and likelihood ratio tests. We found that growth rates were similar between periods, but intercepts and variances of growth curves differed. From this we infer that calves were smaller in autumn during the 1960s and that significant compensatory growth did not occur later in life.
Evaluation of the spatial linear model, random forest and gradient nearest-neighbour methods for imputing potential productivity and biomass of the Pacific Northwest forests
Increasingly, forest management and conservation plans require spatially explicit information within a management or conservation unit. Forest biomass and potential productivity are critical variables for forest planning and assessment in the Pacific Northwest. Their values are often estimated from ground-measured sample data. For unsampled locations, forest analysts and planners lack forest productivity and biomass values, so values must be predicted. Using simulated data and forest inventory and analysis data collected in Oregon and Washington, we examined the performance of the spatial linear model (SLM), random forest (RF) and gradient nearest neighbour (GNN) for mapping and estimating biomass and potential productivity of Pacific Northwest forests. Simulations of artificial populations and subsamplings of forest biomass and productivity data showed that the SLM had smaller empirical root-mean-squared prediction errors (RMSPE) for a wide variety of data types, with generally less bias and better interval coverage than RF and GNN. These patterns held for both point predictions and for population averages, with the SLM reducing RMSPE by 30.0 and 52.6 per cent over two GNN methods in predicting point estimates for forest biomass and potential productivity.
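The comparison criteria named in this abstract (empirical RMSPE, bias, interval coverage, and percentage reduction in RMSPE) can be computed along the following lines. This is a generic sketch, not the authors' code: the function names are hypothetical, the intervals are assumed to be symmetric normal-theory intervals (prediction ± z × standard error), and the numbers in the usage example are made up.

```python
import math

def rmspe(pred, truth):
    """Empirical root-mean-squared prediction error."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth))

def bias(pred, truth):
    """Mean signed prediction error (prediction minus truth)."""
    return sum(p - t for p, t in zip(pred, truth)) / len(truth)

def interval_coverage(pred, se, truth, z=1.96):
    """Fraction of true values falling inside pred +/- z * se
    (assumes symmetric, normal-theory prediction intervals)."""
    hits = sum(1 for p, s, t in zip(pred, se, truth) if abs(p - t) <= z * s)
    return hits / len(truth)

def percent_reduction(rmspe_new, rmspe_ref):
    """Percentage by which rmspe_new improves on rmspe_ref."""
    return 100.0 * (1.0 - rmspe_new / rmspe_ref)

# Made-up held-out values and predictions from two hypothetical methods.
truth = [10.0, 12.0, 9.0, 15.0]
method_a = [10.5, 11.0, 9.5, 14.0]
method_b = [12.0, 10.0, 7.0, 18.0]
reduction = percent_reduction(rmspe(method_a, truth), rmspe(method_b, truth))
```

In a simulation or subsampling study, these quantities would be computed on held-out locations for each method and then averaged over replicates.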
A Comparison of the Spatial Linear Model to Nearest Neighbor (k-NN) Methods for Forestry Applications
Forest surveys provide critical information for many diverse interests. Data are often collected from samples, and from these samples, maps of resources and estimates of areal totals or averages are required. In this paper, two approaches for mapping and estimating totals, the spatial linear model (SLM) and k-NN (k-Nearest Neighbor), are compared theoretically, through simulations, and as applied to real forestry data. While both methods have desirable properties, a review shows that the SLM has prediction optimality properties, and can be quite robust. Simulations of artificial populations and resamplings of real forestry data show that the SLM has smaller empirical root-mean-squared prediction errors (RMSPE) for a wide variety of data types, with generally less bias and better interval coverage than k-NN. These patterns held for both point predictions and for population totals or averages, with the SLM reducing RMSPE from 9% to 67% over some popular k-NN methods, with SLM also more robust to spatially imbalanced sampling. Estimating prediction standard errors remains a problem for k-NN predictors, despite recent attempts using model-based methods. Our conclusions are that the SLM should generally be used rather than k-NN if the goal is accurate mapping or estimation of population totals or averages.