A Computationally Efficient Projection-Based Approach for Spatial Generalized Linear Mixed Models
Inference for spatial generalized linear mixed models (SGLMMs) for
high-dimensional non-Gaussian spatial data is computationally intensive. The
computational challenge stems from the high-dimensional random effects and
from the tendency of Markov chain Monte Carlo (MCMC) algorithms for these
models to mix slowly. Moreover, spatial confounding inflates the variance of
fixed
effect (regression coefficient) estimates. Our approach addresses both the
computational and confounding issues by replacing the high-dimensional spatial
random effects with a reduced-dimensional representation based on random
projections. Standard MCMC algorithms mix well and the reduced-dimensional
setting speeds up computations per iteration. We show, via simulated examples,
that Bayesian inference for this reduced-dimensional approach works well both
for inference and for prediction, and that our methods compare favorably to
existing "reduced-rank" approaches. We also apply our methods to two
real-world data examples, one on bird count data and the other on classifying
rock types.
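The dimension-reduction idea can be sketched in a few lines. The following toy example (the exponential covariance, sizes, and variable names are our own illustrative choices, not the paper's implementation) uses a random projection to build an approximate leading eigenbasis of a spatial covariance matrix, so the n-dimensional random effect is replaced by an m-dimensional coefficient vector:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: n spatial locations with an exponential covariance.
n, m = 500, 25                     # full vs. reduced dimension
coords = rng.uniform(0, 1, size=(n, 2))
dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
C = np.exp(-dists / 0.2)           # covariance of the spatial random effects

# Random projection: sketch C with a Gaussian test matrix, orthonormalize,
# then solve the small eigenproblem (a basic randomized eigendecomposition).
omega = rng.standard_normal((n, m))
Q, _ = np.linalg.qr(C @ omega)     # n x m orthonormal basis for range(C @ omega)
evals, evecs = np.linalg.eigh(Q.T @ C @ Q)
U = Q @ evecs[:, ::-1]             # approximate leading eigenvectors of C

# Reduced representation: w ~= U @ delta with delta only m-dimensional, so an
# MCMC sampler updates m << n coefficients per iteration instead of n.
delta = rng.standard_normal(m)
w_approx = U @ delta
print(U.shape, w_approx.shape)
```

In a full SGLMM fit, `delta` (and its prior precision) would be sampled inside the MCMC loop rather than drawn once as here.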
Gaussian Process Morphable Models
Statistical shape models (SSMs) represent a class of shapes as a normal
distribution of point variations, whose parameters are estimated from example
shapes. Principal component analysis (PCA) is applied to obtain a
low-dimensional representation of the shape variation in terms of the leading
principal components. In this paper, we propose a generalization of SSMs,
called Gaussian Process Morphable Models (GPMMs). We model the shape variations
with a Gaussian process, which we represent using the leading components of its
Karhunen-Loeve expansion. To compute the expansion, we make use of an
approximation scheme based on the Nystrom method. The resulting model can be
seen as a continuous analogue of an SSM. However, while for SSMs the shape
variation is restricted to the span of the example data, with GPMMs we can
define the shape variation using any Gaussian process. For example, we can
build shape models that correspond to classical spline models, and thus do not
require any example data. Furthermore, Gaussian processes make it possible to
combine different models. For example, an SSM can be extended with a spline
model, to obtain a model that incorporates learned shape characteristics, but
is flexible enough to explain shapes that cannot be represented by the SSM. We
introduce a simple algorithm for fitting a GPMM to a surface or image. This
results in a non-rigid registration approach, whose regularization properties
are defined by a GPMM. We show how we can obtain different registration
schemes, including methods for multi-scale, spatially-varying or hybrid
registration, by constructing an appropriate GPMM. As our approach strictly
separates modelling from the fitting process, this is all achieved without
changes to the fitting algorithm. We show the applicability and versatility of
GPMMs on a clinical use case, where the goal is the model-based segmentation of
3D forearm images.
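The Karhunen-Loeve/Nystrom construction described above can be sketched concisely. This toy version works on a 1D domain with a scalar Gaussian kernel (GPMMs use vector-valued kernels over surfaces); the kernel, sample sizes, and scalings are our illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy scalar kernel on [0, 1]; GPMMs use vector-valued kernels on surfaces.
def kernel(a, b, s=0.15):
    return np.exp(-((a[:, None] - b[None, :]) ** 2) / (2 * s ** 2))

x_sub = np.linspace(0.0, 1.0, 50)           # Nystrom subsample points
K = kernel(x_sub, x_sub)
evals, evecs = np.linalg.eigh(K)
evals, evecs = evals[::-1], evecs[:, ::-1]  # sort descending

r = 10                                      # leading Karhunen-Loeve components
lam = evals[:r] / len(x_sub)                # approximate operator eigenvalues

def eigfuns(x):
    # Nystrom extension of the leading eigenfunctions to arbitrary points
    return kernel(x, x_sub) @ evecs[:, :r] * np.sqrt(len(x_sub)) / evals[:r]

# Draw a random deformation from the rank-r GP: f = sum_i sqrt(lam_i) a_i phi_i
x = np.linspace(0.0, 1.0, 400)
alpha = rng.standard_normal(r)
deformation = eigfuns(x) @ (np.sqrt(lam) * alpha)
print(deformation.shape)
```

Fitting a GPMM to data then amounts to optimizing the low-dimensional coefficient vector `alpha` rather than a dense deformation field.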
Topics in High-Dimensional Statistics and the Analysis of Large Hyperspectral Images
Advances in imaging technology have made hyperspectral images gathered from remote sensing much more common. The high-dimensional nature of these large-scale data, coupled with wavelength and spatial dependency, necessitates efficient high-dimensional computational methods that produce results that are concise and easy to understand. This thesis addresses these issues by examining high-dimensional methods in the context of hyperspectral image classification, unmixing, and wavelength correlation estimation.
Chapter 2 re-examines the sparse Bayesian learning (SBL) of linear models in a high-dimensional setting with a sparse signal. The hard-thresholded version of the SBL estimator, under orthogonal design, achieves a non-asymptotic error rate comparable to that of the LASSO. We also establish that, with high probability, the estimator recovers the sparsity structure of the signal. The ability to recover sparsity structures in high-dimensional settings is crucial for unmixing with high-dimensional libraries in the next chapter.

In Chapter 3, the thesis investigates the application of SBL to the task of linear/bilinear unmixing and classification of hyperspectral images. The proposed model uses latent Markov random fields to classify pixels and account for the spatial dependence between pixels. In this model, pixels belonging to the same group share the same mixture of pure endmembers. Unmixing and classification are performed simultaneously, but this method does not address wavelength dependence.

Chapter 4 is a natural extension of the previous chapter and provides a framework that accounts for both spatial and wavelength dependence in the unmixing of hyperspectral images. The classification of the images is performed using approximate spectral clustering, while the unmixing task is performed in tandem with sparse wavelength concentration matrix estimation.

PhD thesis, Statistics, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/135893/1/chye_1.pd
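A minimal sketch of the Chapter 2 setting, using the classical evidence-maximisation updates for SBL followed by hard thresholding of the posterior mean. The design, noise level, threshold, and iteration count below are our own assumptions for illustration, not the thesis's estimator or its rates:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy sparse linear model with an orthogonal design.
n, p, k = 64, 64, 5
X, _ = np.linalg.qr(rng.standard_normal((n, p)))      # orthogonal columns
beta = np.zeros(p)
beta[rng.choice(p, size=k, replace=False)] = rng.normal(0.0, 3.0, size=k)
y = X @ beta + 0.1 * rng.standard_normal(n)

# Sparse Bayesian learning: alternate posterior moments with per-coefficient
# precision (alpha) updates; the noise variance is held fixed for simplicity.
alpha, sigma2 = np.ones(p), 0.01
for _ in range(50):
    Sigma = np.linalg.inv(X.T @ X / sigma2 + np.diag(alpha))
    mu = Sigma @ X.T @ y / sigma2
    gamma = 1.0 - alpha * np.diag(Sigma)
    alpha = np.clip(gamma / np.maximum(mu ** 2, 1e-12), 0.0, 1e6)

# Hard-threshold the posterior mean to obtain a sparse estimate.
tau = 0.5
beta_hat = np.where(np.abs(mu) > tau, mu, 0.0)
print(int(np.count_nonzero(beta_hat)))
```

The SBL precisions already drive most irrelevant coefficients toward zero; the hard threshold then zeroes out the residual small entries exactly, which is what enables support recovery.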
A Tensor Approach to Learning Mixed Membership Community Models
Community detection is the task of detecting hidden communities from observed
interactions. Guaranteed community detection has so far been mostly limited to
models with non-overlapping communities such as the stochastic block model. In
this paper, we remove this restriction, and provide guaranteed community
detection for a family of probabilistic network models with overlapping
communities, termed as the mixed membership Dirichlet model, first introduced
by Airoldi et al. This model allows for nodes to have fractional memberships in
multiple communities and assumes that the community memberships are drawn from
a Dirichlet distribution. Moreover, it contains the stochastic block model as a
special case. We propose a unified approach to learning these models via a
tensor spectral decomposition method. Our estimator is based on a low-order
moment tensor of the observed network, consisting of 3-star counts. Our
learning method is fast and is based on simple linear algebraic operations,
e.g. singular value decomposition and tensor power iterations. We provide
guaranteed recovery of community memberships and model parameters and present a
careful finite sample analysis of our learning method. As an important special
case, our results match the best known scaling requirements for the
(homogeneous) stochastic block model.
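The core computational step can be sketched as a tensor power iteration with deflation. In the paper the third-order tensor is estimated from 3-star counts in the network; here we build a synthetic orthogonally decomposable tensor directly, and the dimensions, weights, and iteration count are illustrative choices of ours:

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic odeco moment tensor T = sum_k w_k * v_k (x) v_k (x) v_k.
d, k = 8, 3
V, _ = np.linalg.qr(rng.standard_normal((d, k)))    # orthonormal components
w = np.array([3.0, 2.0, 1.0])
T = np.einsum('k,ik,jk,lk->ijl', w, V, V, V)

def tensor_power_deflate(T, k, iters=100):
    """Recover rank-1 components of an odeco tensor by power iteration."""
    weights, comps = [], []
    for _ in range(k):
        u = rng.standard_normal(T.shape[0])
        u /= np.linalg.norm(u)
        for _ in range(iters):
            u = np.einsum('ijl,j,l->i', T, u, u)    # u <- T(I, u, u)
            u /= np.linalg.norm(u)
        lam = np.einsum('ijl,i,j,l->', T, u, u, u)  # eigenvalue T(u, u, u)
        weights.append(lam)
        comps.append(u)
        T = T - lam * np.einsum('i,j,l->ijl', u, u, u)   # deflate
    return np.array(weights), np.array(comps)

w_hat, V_hat = tensor_power_deflate(T, k)
print(np.round(np.sort(w_hat)[::-1], 6))
```

In the full method these recovered components and weights are mapped back to community membership vectors and Dirichlet parameters; a robust variant restarts the power iteration from multiple random initializations.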