
    High-Dimensional Bayesian Geostatistics

    With the growing capabilities of Geographic Information Systems (GIS) and user-friendly software, statisticians today routinely encounter geographically referenced data containing observations from a large number of spatial locations and time points. Over the last decade, hierarchical spatiotemporal process models have become widely deployed statistical tools for researchers to better understand the complex nature of spatial and temporal variability. However, fitting hierarchical spatiotemporal models often involves expensive matrix computations whose complexity grows cubically in the number of spatial locations and temporal points. This renders such models infeasible for large data sets. This article offers a focused review of two methods for constructing well-defined, highly scalable spatiotemporal stochastic processes. Both processes can be used as "priors" for spatiotemporal random fields. The first approach constructs a low-rank process operating on a lower-dimensional subspace. The second approach constructs a Nearest-Neighbor Gaussian Process (NNGP) that ensures sparse precision matrices for its finite realizations. Both processes can be exploited as a scalable prior embedded within a rich hierarchical modeling framework to deliver full Bayesian inference. These approaches can be described as model-based solutions for big spatiotemporal datasets. The models ensure that the algorithmic complexity is $\sim n$ floating point operations (flops) per iteration, where $n$ is the number of spatial locations. We compare these methods and provide some insight into their methodological underpinnings.
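
To make the sparse-precision idea in the abstract above concrete, here is a minimal sketch of a nearest-neighbor construction, not the article's implementation. It assumes an exponential covariance with illustrative parameters, uses the order in which the locations are given, and stores matrices densely for readability; the names `exp_cov`, `nngp_factors`, and `nngp_precision` are hypothetical. Each location is conditioned on at most `m` nearest previously ordered locations, which gives a strictly lower-triangular weight matrix `A` with at most `m` nonzeros per row and a precision matrix of the form `(I - A)^T D^{-1} (I - A)`.

```python
import numpy as np

def exp_cov(a, b, sigma2=1.0, phi=3.0):
    """Exponential covariance between two sets of points (assumed kernel)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sigma2 * np.exp(-phi * d)

def nngp_factors(coords, m, cov=exp_cov):
    """Return (A, D): neighbor weights and conditional variances.

    Row i of A holds the kriging weights of location i on its (at most) m
    nearest neighbors among locations 0..i-1; D[i] is the conditional variance.
    """
    n = coords.shape[0]
    A = np.zeros((n, n))
    D = np.zeros(n)
    D[0] = cov(coords[:1], coords[:1])[0, 0]
    for i in range(1, n):
        dist = np.linalg.norm(coords[:i] - coords[i], axis=1)
        nb = np.argsort(dist)[:m]                  # indices of nearest earlier points
        C_nn = cov(coords[nb], coords[nb])         # neighbor-neighbor covariance
        c_in = cov(coords[i:i + 1], coords[nb])[0] # point-neighbor covariance
        w = np.linalg.solve(C_nn, c_in)
        A[i, nb] = w
        D[i] = cov(coords[i:i + 1], coords[i:i + 1])[0, 0] - c_in @ w
    return A, D

def nngp_precision(A, D):
    """Precision matrix (I - A)^T D^{-1} (I - A) implied by the factors."""
    I_minus_A = np.eye(len(D)) - A
    return I_minus_A.T @ (I_minus_A / D[:, None])
```

With `m` equal to `n - 1` every location conditions on all earlier ones and the result reproduces the exact inverse covariance; with small `m` the factors stay sparse, which is where the roughly linear cost per iteration comes from when sparse storage is used.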

    Skellam shrinkage: Wavelet-based intensity estimation for inhomogeneous Poisson data

    The ubiquity of integrating detectors in imaging and other applications implies that a variety of real-world data are well modeled as Poisson random variables whose means are in turn proportional to an underlying vector-valued signal of interest. In this article, we first show how the so-called Skellam distribution arises from the fact that Haar wavelet and filterbank transform coefficients corresponding to measurements of this type are distributed as sums and differences of Poisson counts. We then provide two main theorems on Skellam shrinkage, one showing the near-optimality of shrinkage in the Bayesian setting and the other providing for unbiased risk estimation in a frequentist context. These results serve to yield new estimators in the Haar transform domain, including an unbiased risk estimate for shrinkage of Haar-Fisz variance-stabilized data, along with accompanying low-complexity algorithms for inference. We conclude with a simulation study demonstrating the efficacy of our Skellam shrinkage estimators both for the standard univariate wavelet test functions as well as a variety of test images taken from the image processing literature, confirming that they offer substantial performance improvements over existing alternatives. Comment: 27 pages, 8 figures, slight formatting changes; submitted for publication
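
The link between Haar coefficients and the Skellam distribution mentioned in the abstract above can be illustrated with a short simulation. This is a sketch under stated assumptions rather than the paper's code: it uses an unnormalized one-level Haar step (plain sums and differences of adjacent counts), and the names `haar_step` and `simulate_detail` as well as the default seed are hypothetical. The difference of two independent Poisson counts with means `lam1` and `lam2` follows a Skellam distribution with mean `lam1 - lam2` and variance `lam1 + lam2`.

```python
import numpy as np

def haar_step(counts):
    """One unnormalized Haar level: pairwise sums (scaling) and differences (detail)."""
    counts = np.asarray(counts)
    even, odd = counts[..., 0::2], counts[..., 1::2]
    return even + odd, even - odd

def simulate_detail(lam1, lam2, size, seed=0):
    """Draw Haar detail coefficients for pairs of independent Poisson counts."""
    rng = np.random.default_rng(seed)
    pairs = np.stack([rng.poisson(lam1, size), rng.poisson(lam2, size)], axis=-1)
    _, detail = haar_step(pairs)
    return detail[..., 0]  # Skellam-distributed differences
```

Comparing the sample mean and variance of `simulate_detail(5.0, 2.0, 200_000)` with `3` and `7` gives a quick sanity check of the Skellam moments.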

    Spatial modeling using graphical Markov models and wavelets

    Graphical Markov models use graphs to represent possible dependencies among random variables. This class of models is extremely rich and includes, inter alia, causal Markov models and Markov random fields. In this dissertation, we develop a very efficient optimal-prediction algorithm for graphical Markov models. The algorithm is a generalization of the Kalman-filter algorithm for temporal processes, and it can in principle be applied to any Gaussian undirected graphical model and any Gaussian acyclic directed graphical model. We also propose a new class of multiscale models for stochastic processes in terms of scale-recursive dynamics defined on acyclic directed graphs. The models are an extension of multiscale tree-structured models. The optimal prediction can be obtained using the newly developed generalized Kalman-filter algorithm referred to above, and the parameters can be estimated by maximum likelihood via the EM algorithm. One subclass of these models consists of multiscale wavelet models, for which we show that the optimal predictors of hidden state variables can be obtained by a level-dependent (scale-dependent) wavelet shrinkage rule. In a series of papers, D. Donoho and I. Johnstone develop wavelet shrinkage methods to solve statistical problems. We propose a new rationale for wavelet shrinkage, based on the assumption that the underlying process can be decomposed into a large-scale deterministic trend plus a small-scale Gaussian process. Our approach has several advantages over current shrinkage methods. It takes the dependencies of empirical wavelet coefficients, both within scales and across scales, into account. Moreover, it does not rely on asymptotic properties for its justification, so it is also appropriate when the sample size is small. Finally, we introduce partially ordered Markov models, which are acyclic directed graphical models for spatial problems. Such a model can be regarded as a Markov random field with neighborhood structures derivable from an associated partially ordered set. We use a martingale approach to derive the asymptotic properties of maximum (composite) likelihood estimators for partially ordered Markov models. We prove that the maximum (composite) likelihood estimators are consistent, asymptotically normal, and also asymptotically efficient under checkable conditions.

    Spatial Joint Species Distribution Modeling using Dirichlet Processes

    Species distribution models usually attempt to explain presence-absence or abundance of a species at a site in terms of the environmental features (so-called abiotic features) present at the site. Historically, such models have considered species individually. However, it is well established that species interact to influence presence-absence and abundance (envisioned as biotic factors). As a result, there has been substantial recent interest in joint species distribution models with various types of response, e.g., presence-absence, continuous, and ordinal data. Such models incorporate dependence between species responses as a surrogate for interaction. The challenge we focus on here is how to address such modeling in the context of a large number of species (e.g., on the order of $10^2$) across sites numbering on the order of $10^2$ or $10^3$ when, in practice, only a few species are found at any observed site. Again, there is some recent literature that addresses this; we adopt a dimension reduction approach. The novel wrinkle we add here is spatial dependence. That is, we have a collection of sites over a relatively small spatial region, so it is anticipated that species distribution at a given site would be similar to that at a nearby site. Specifically, we handle dimension reduction through Dirichlet processes joined with spatial dependence through Gaussian processes. We use both simulated data and a plant communities dataset for the Cape Floristic Region (CFR) of South Africa to demonstrate our approach. The latter consists of presence-absence measurements for 639 tree species at 662 locations. Through both data examples we are able to demonstrate improved predictive performance using the foregoing specification.
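
As a rough illustration of how a Dirichlet process can reduce dimension across many species, the sketch below draws truncated stick-breaking weights and assigns species to a small number of shared clusters. It is a generic construction and not the authors' model: the truncation level `K`, the concentration `alpha`, and the names `stick_breaking` and `assign_clusters` are assumptions made for illustration, and no spatial Gaussian process component is included.

```python
import numpy as np

def stick_breaking(alpha, K, rng):
    """Truncated stick-breaking weights for a Dirichlet process with concentration alpha."""
    v = rng.beta(1.0, alpha, size=K)
    v[-1] = 1.0  # truncation: the last stick takes all remaining mass
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining

def assign_clusters(n_species, alpha, K, rng):
    """Assign each species to one of K latent clusters using the stick weights."""
    weights = stick_breaking(alpha, K, rng)
    return rng.choice(K, size=n_species, p=weights)
```

For example, `assign_clusters(639, 1.0, 50, np.random.default_rng(1))` maps 639 species onto at most 50 cluster labels, so species in the same cluster can share parameters.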

    Wavelet regression estimation in nonparametric mixed effect models

    We show that a nonparametric estimator of a regression function, obtained as the solution of a specific regularization problem, is the best linear unbiased predictor in some nonparametric mixed effect model. Since this estimator is numerically intractable, we propose a tight approximation of it that is easy and fast to implement. This second estimator achieves the usual optimal rate of convergence of the mean integrated squared error over a Sobolev class for both equispaced and nonequispaced designs. Numerical experiments are presented on both simulated data and real ERP data.