A Bayesian space–time model for clustering areal units based on their disease trends
Population-level disease risk across a set of non-overlapping areal units varies in space and time, and a large research literature has developed methodology for identifying clusters of areal units exhibiting elevated risks. However, almost no research has extended the clustering paradigm to identify groups of areal units exhibiting similar temporal disease trends. We present a novel Bayesian hierarchical mixture model for achieving this goal, with inference based on a Metropolis-coupled Markov chain Monte Carlo ((MC)³) algorithm. The effectiveness of the (MC)³ algorithm compared to a standard Markov chain Monte Carlo implementation is demonstrated in a simulation study, and the methodology is motivated by two important case studies in the United Kingdom. The first concerns the impact on measles susceptibility of the discredited paper linking the measles, mumps, and rubella vaccination to an increased risk of autism, and investigates whether all areas in Scotland were equally affected. The second concerns respiratory hospitalizations and investigates, over a 10-year period, which parts of Glasgow have shown increased, decreased, or unchanged risk.
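As a rough illustration of why Metropolis coupling can help on multimodal posteriors, the following is a minimal sketch, not the authors' implementation. It runs several random-walk Metropolis chains on tempered versions of a toy bimodal target and proposes state swaps between adjacent chains. The target, the temperature ladder `temps`, the step size, and all function names are assumptions made for this example.

```python
import math
import random


def log_target(x):
    """Unnormalised log density of an equal mixture of N(-4, 1) and N(4, 1) (toy target)."""
    a = -0.5 * (x + 4.0) ** 2
    b = -0.5 * (x - 4.0) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def swap_log_ratio(x_i, x_j, temp_i, temp_j):
    """Log acceptance ratio for exchanging the states of chains i and j.

    Chain k targets pi(x) ** temp_k, so the ratio reduces to
    (temp_i - temp_j) * (log pi(x_j) - log pi(x_i)).
    """
    return (temp_i - temp_j) * (log_target(x_j) - log_target(x_i))


def accept(rng, log_alpha):
    """Metropolis accept/reject with probability min(1, exp(log_alpha))."""
    return rng.random() < math.exp(min(0.0, log_alpha))


def mc3(n_iter, temps=(1.0, 0.5, 0.25, 0.1), step=1.0, seed=0):
    """Run the coupled chains and return the draws of the cold (temp = 1) chain."""
    rng = random.Random(seed)
    states = [-4.0] * len(temps)  # every chain starts in the left-hand mode
    cold_draws = []
    for _ in range(n_iter):
        # Within-chain random-walk Metropolis update on each tempered target.
        for k, temp in enumerate(temps):
            proposal = states[k] + rng.gauss(0.0, step)
            if accept(rng, temp * (log_target(proposal) - log_target(states[k]))):
                states[k] = proposal
        # Propose swapping the states of one randomly chosen adjacent pair.
        i = rng.randrange(len(temps) - 1)
        if accept(rng, swap_log_ratio(states[i], states[i + 1], temps[i], temps[i + 1])):
            states[i], states[i + 1] = states[i + 1], states[i]
        cold_draws.append(states[0])
    return cold_draws
```

The heated chains (small `temp`) see a flattened target and cross between modes more easily; accepted swaps pass those moves down to the cold chain, whose draws are the ones used for inference.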
Feature discovery and visualization of robot mission data using convolutional autoencoders and Bayesian nonparametric topic models
The gap between our ability to collect interesting data and our ability to
analyze these data is growing at an unprecedented rate. Recent algorithmic
attempts to fill this gap have employed unsupervised tools to discover
structure in data. Some of the most successful approaches have used
probabilistic models to uncover latent thematic structure in discrete data.
Despite the success of these models on textual data, they have not generalized
as well to image data, in part because of the spatial and temporal structure
that may exist in an image stream.
We introduce a novel unsupervised machine learning framework that
incorporates the ability of convolutional autoencoders to discover features
from images that directly encode spatial information, within a Bayesian
nonparametric topic model that discovers meaningful latent patterns within
discrete data. By using this hybrid framework, we overcome the fundamental
dependency of traditional topic models on rigidly hand-coded data
representations, while simultaneously encoding spatial dependency in our topics
without adding model complexity. We apply this model to the motivating
application of high-level scene understanding and mission summarization for
exploratory marine robots. Our experiments on a seafloor dataset collected by a
marine robot show that the proposed hybrid framework outperforms current
state-of-the-art approaches on the task of unsupervised seafloor terrain
characterization.
Spatial clustering of average risks and risk trends in Bayesian disease mapping
Spatiotemporal disease mapping focuses on estimating the spatial pattern in disease risk across a set of nonoverlapping areal units over a fixed period of time. The key aim of such research is to identify areas that have a high average level of disease risk or where disease risk is increasing over time, thus allowing public health interventions to be focused on these areas. Such aims are well suited to the statistical approach of clustering, and while much research has been done in this area in a purely spatial setting, only a handful of approaches have focused on spatiotemporal clustering of disease risk. Therefore, this paper outlines a new modeling approach for clustering spatiotemporal disease risk data, by clustering areas based on both their mean risk levels and the behavior of their temporal trends. The efficacy of the methodology is established by a simulation study and illustrated by a study of respiratory disease risk in Glasgow, Scotland.
Bayesian nonparametric models for spatially indexed data of mixed type
We develop Bayesian nonparametric models for spatially indexed data of mixed
type. Our work is motivated by challenges that occur in environmental
epidemiology, where the usual presence of several confounding variables that
exhibit complex interactions and high correlations makes it difficult to
estimate and understand the effects of risk factors on health outcomes of
interest. The modeling approach we adopt assumes that responses and confounding
variables are manifestations of continuous latent variables, and uses
multivariate Gaussians to jointly model these. Responses and confounding
variables are not treated equally: only the relevant parameters of the
response distributions are modeled in terms of explanatory variables or risk
factors. Spatial dependence is introduced by allowing the weights of the
nonparametric process priors to be location specific, obtained as probit
transformations of Gaussian Markov random fields. Confounding variables and
spatial configuration have a similar role in the model, in that they only
influence, along with the responses, the allocation probabilities of the areas
into the mixture components, thereby allowing for flexible adjustment of the
effects of observed confounders, while allowing for the possibility of residual
spatial structure, possibly occurring due to unmeasured or undiscovered
spatially varying factors. Aspects of the model are illustrated in simulation
studies and an application to a real data set.
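To make the idea of location-specific weights obtained from probit transformations more concrete, here is a minimal sketch of one standard construction that fits the description, probit stick-breaking. The abstract does not confirm that this exact construction is used, so treat it as an assumption; the function names and the truncation to K components are also hypothetical.

```python
import math


def std_normal_cdf(z):
    """Standard normal CDF, i.e. the inverse of the probit link."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def stick_breaking_weights(latent):
    """Map the latent values at one location to K mixture weights.

    `latent` holds K - 1 real values (hypothetically, one Gaussian Markov
    random field value per component at this location). Each value is pushed
    through the normal CDF to give the fraction of the remaining stick that
    the component takes; the K-th weight is whatever is left over.
    """
    weights = []
    remaining = 1.0
    for z in latent:
        frac = std_normal_cdf(z)
        weights.append(frac * remaining)
        remaining *= 1.0 - frac
    weights.append(remaining)
    return weights


# Latent values of zero break off half of the remaining stick each time.
print(stick_breaking_weights([0.0, 0.0]))  # [0.5, 0.25, 0.25]
```

Because neighbouring locations would have correlated latent values under a Gaussian Markov random field, their weights, and hence their allocation probabilities, would be similar under this construction.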
Modeling and estimation of multi-source clustering in crime and security data
While the presence of clustering in crime and security event data is well
established, the mechanism(s) by which clustering arises is not fully
understood. Both contagion models and history independent correlation models
are applied, but not simultaneously. In an attempt to disentangle contagion
from other types of correlation, we consider a Hawkes process with background
rate driven by a log Gaussian Cox process. Our inference methodology is an
efficient Metropolis adjusted Langevin algorithm for filtering of the intensity
and estimation of the model parameters. We apply the methodology to property
and violent crime data from Chicago, terrorist attack data from Northern
Ireland and Israel, and civilian casualty data from Iraq. For each data set we
quantify the uncertainty in the levels of contagion vs. history independent
correlation.
Comment: Published at http://dx.doi.org/10.1214/13-AOAS647 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
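The split between contagion and history-independent correlation can be read off the conditional intensity of such a model. The sketch below is illustrative only: it assumes an exponential triggering kernel, which the abstract does not specify, and uses a placeholder function `field` in place of a draw from a Gaussian process. All names and parameters are hypothetical.

```python
import math


def hawkes_intensity(t, events, background, alpha, beta):
    """Conditional intensity at time t given the earlier event times.

    lambda(t) = background(t)
                + sum over t_i < t of alpha * beta * exp(-beta * (t - t_i))

    alpha is the expected number of events triggered by each event (the
    contagion part); beta is the decay rate of the assumed exponential kernel.
    """
    excitation = sum(
        alpha * beta * math.exp(-beta * (t - t_i)) for t_i in events if t_i < t
    )
    return background(t) + excitation


def log_gaussian_background(mean_log_rate, field):
    """Background rate exp(mean_log_rate + field(t)), as in a log Gaussian Cox process.

    `field` stands in for a Gaussian process sample path (an assumption here).
    """
    return lambda t: math.exp(mean_log_rate + field(t))


# Flat latent field: the background rate is exp(0) = 1 everywhere.
mu = log_gaussian_background(0.0, lambda t: 0.0)
print(hawkes_intensity(1.0, [], mu, alpha=0.5, beta=1.0))     # background only
print(hawkes_intensity(1.0, [0.0], mu, alpha=0.5, beta=1.0))  # plus one past event
```

In this form, clustering driven by `background` does not depend on the event history, while clustering driven by `alpha` does, which is the distinction the abstract sets out to quantify.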