Accelerated Parallel Non-conjugate Sampling for Bayesian Non-parametric Models
Inference of latent feature models in the Bayesian nonparametric setting is
generally difficult, especially in high dimensional settings, because it
usually requires proposing features from some prior distribution. In special
cases, where the integration is tractable, we could sample new feature
assignments according to a predictive likelihood. However, this still may not
be efficient in high dimensions. We present a novel method to accelerate the
mixing of latent variable model inference by proposing feature locations from
the data, as opposed to the prior. First, we introduce our accelerated feature
proposal mechanism that we will show is a valid Bayesian inference algorithm
and next we propose an approximate inference strategy to perform accelerated
inference in parallel. This sampling method promotes proper mixing of the
Markov chain Monte Carlo sampler, is computationally attractive, and is
theoretically guaranteed to converge to the posterior distribution as its
limiting distribution.

Comment: Previously known as "Accelerated Inference for Latent Variable
Models".
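The core idea of the abstract, proposing feature locations from the data rather than the prior while retaining a valid sampler, can be illustrated with a minimal sketch. This is not the paper's algorithm; it is a toy one-dimensional model with a single latent feature location, where a data-driven independence proposal is made valid by evaluating its density inside the Metropolis-Hastings correction. All names and parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (illustrative, not the paper's model): observations scattered
# around a single latent feature location mu, with a wide normal prior on mu.
data = rng.normal(loc=3.0, scale=0.5, size=100)

def log_post(mu):
    # log N(0, 10^2) prior + log N(mu, 0.5^2) likelihood, up to constants
    return -0.5 * (mu / 10.0) ** 2 - 0.5 * np.sum(((data - mu) / 0.5) ** 2)

def q_logpdf(mu, scale=0.5):
    # Data-driven independence proposal: an equal-weight mixture of
    # normals centered at the observed data points.
    comps = -0.5 * ((mu - data) / scale) ** 2 - np.log(scale * np.sqrt(2 * np.pi))
    return np.logaddexp.reduce(comps) - np.log(len(data))

def q_sample(scale=0.5):
    # Propose a feature location near a randomly chosen data point,
    # instead of drawing from the prior.
    return rng.normal(rng.choice(data), scale)

def mh_step(mu):
    # Metropolis-Hastings with the proposal density evaluated explicitly,
    # so proposing from the data still targets the exact posterior.
    prop = q_sample()
    log_acc = (log_post(prop) + q_logpdf(mu)) - (log_post(mu) + q_logpdf(prop))
    return prop if np.log(rng.uniform()) < log_acc else mu

mu = 0.0  # deliberately poor initialization, far from the data
for _ in range(2000):
    mu = mh_step(mu)
```

Because the first proposals already land near the data, the chain escapes the poor initialization in a single accepted move, which is the mixing acceleration the abstract describes; a prior-centered proposal would wander much longer.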
Scalable Estimation of Dirichlet Process Mixture Models on Distributed Data
We consider the estimation of Dirichlet Process Mixture Models (DPMMs) in
distributed environments, where data are distributed across multiple computing
nodes. A key advantage of Bayesian nonparametric models such as DPMMs is that
they allow new components to be introduced on the fly as needed. This, however,
poses an important challenge to distributed estimation -- how to handle new
components efficiently and consistently. To tackle this problem, we propose a
new estimation method, which allows new components to be created locally in
individual computing nodes. Components corresponding to the same cluster will
be identified and merged via a probabilistic consolidation scheme. In this way,
we can maintain the consistency of estimation with very low communication cost.
Experiments on large real-world data sets show that the proposed method can
achieve high scalability in distributed and asynchronous environments without
compromising the mixing performance.

Comment: Published in the IJCAI 2017 proceedings:
https://www.ijcai.org/proceedings/2017/64
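The two-phase idea above, components created locally on each node and then consolidated globally, can be sketched in a few lines. This is a deliberately crude stand-in: local DPMM inference is replaced by greedy one-dimensional component formation, and the paper's probabilistic consolidation is replaced by a deterministic distance-based merge; all thresholds and names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two "computing nodes", each holding a shard drawn from the same two clusters.
shards = [
    np.concatenate([rng.normal(0.0, 0.1, 50), rng.normal(5.0, 0.1, 50)]),
    np.concatenate([rng.normal(0.0, 0.1, 50), rng.normal(5.0, 0.1, 50)]),
]

def local_components(shard, gap=1.0):
    # Stand-in for local DPMM inference on one node: greedily open a new
    # component whenever a point sits far from every existing component mean.
    comps = []  # each component tracked as [running sum, count]
    for x in np.sort(shard):
        for c in comps:
            if abs(x - c[0] / c[1]) < gap:
                c[0] += x
                c[1] += 1
                break
        else:
            comps.append([x, 1.0])  # new component created locally, no sync
    return [c[0] / c[1] for c in comps]

def consolidate(per_node_means, gap=1.0):
    # Identify components from different nodes that describe the same
    # cluster and merge them -- a deterministic simplification of the
    # paper's probabilistic consolidation scheme.
    merged = []  # [running sum, count] per global component
    for m in sorted(m for means in per_node_means for m in means):
        if merged and abs(m - merged[-1][0] / merged[-1][1]) < gap:
            merged[-1][0] += m
            merged[-1][1] += 1
        else:
            merged.append([m, 1])
    return [s / n for s, n in merged]

global_means = consolidate([local_components(s) for s in shards])
print(len(global_means))
```

The only cross-node communication is the final list of component summaries, which is what keeps the communication cost low in the distributed setting the abstract describes.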
Asymptotically Exact, Embarrassingly Parallel MCMC
Communication costs, resulting from synchronization requirements during
learning, can greatly slow down many parallel machine learning algorithms. In
this paper, we present a parallel Markov chain Monte Carlo (MCMC) algorithm in
which subsets of data are processed independently, with very little
communication. First, we arbitrarily partition data onto multiple machines.
Then, on each machine, any classical MCMC method (e.g., Gibbs sampling) may be
used to draw samples from a posterior distribution given the data subset.
Finally, the samples from each machine are combined to form samples from the
full posterior. This embarrassingly parallel algorithm allows each machine to
act independently on a subset of the data (without communication) until the
final combination stage. We prove that our algorithm generates asymptotically
exact samples and empirically demonstrate its ability to parallelize burn-in
and sampling in several models.
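The partition-sample-combine pipeline above can be sketched on a conjugate toy model where each subposterior can be sampled exactly (in general, any MCMC method would run independently per machine). The combination step here is the simple parametric (Gaussian product) recombination; the choice of model, shard count, and sample sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy model: x_i ~ N(theta, 1) with an (improper) flat prior, so the full
# posterior over theta is N(mean(data), 1/n); the prior's 1/M power drops out.
data = rng.normal(2.0, 1.0, size=1000)
M = 4
shards = np.array_split(data, M)  # step 1: arbitrary partition onto machines

# Step 2: sample each subposterior independently, with no communication.
# Here that is exact: p_m(theta) = N(mean(shard), 1/len(shard)).
sub_samples = [rng.normal(s.mean(), 1.0 / np.sqrt(len(s)), size=5000)
               for s in shards]

# Step 3: combine. A Gaussian fit to each subposterior is multiplied
# together; for normals the product has summed precisions and a
# precision-weighted mean.
precisions = np.array([1.0 / np.var(s) for s in sub_samples])
means = np.array([s.mean() for s in sub_samples])
comb_prec = precisions.sum()
comb_mean = (precisions * means).sum() / comb_prec

print(comb_mean, 1.0 / np.sqrt(comb_prec))
```

In this conjugate case the combined mean and standard deviation should closely match the full-data posterior, N(mean(data), 1/1000), which is the "asymptotically exact" behavior the abstract claims in general.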
Counting People by Clustering Person Detector Outputs
We present a people counting system that estimates the number of people in a scene by employing a clustering scheme based on Dirichlet Process Mixture Models (DPMMs), which takes the outputs of a person detector as input. For each frame, we run a person detector, take its output as a set of detection areas, and define a set of features based on spatial, color, and temporal information for each detection. Using these features, we cluster the detections with DPMMs and Gibbs sampling, imposing no restriction on the number of clusters, and can therefore estimate an arbitrary number of people or groups of people. Finally, we define a measure to calculate the actual number of people within each cluster and infer the final estimate of the number of people in the scene.
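The key mechanism, clustering detections with a DPMM via Gibbs sampling so the number of clusters is unbounded, can be sketched with a Chinese restaurant process sampler on a single hypothetical feature. This is not the paper's system: the detector output is simulated, the feature is one-dimensional, and the likelihood term is a crude fixed-variance approximation; all parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical detector output: one feature (say, horizontal position)
# per detection, with three well-separated groups of people in the scene.
feats = np.concatenate([rng.normal(c, 0.2, 15) for c in (0.0, 3.0, 6.0)])
alpha, sigma = 0.1, 0.2   # CRP concentration, within-cluster spread

z = np.zeros(len(feats), dtype=int)  # start with every detection together

def gibbs_sweep(z):
    # One sweep of (crudely collapsed) Gibbs sampling under a CRP prior:
    # each detection joins an existing cluster with weight n_k * likelihood,
    # or opens a new cluster with weight alpha -- no cap on cluster count.
    for i in range(len(feats)):
        z[i] = -1  # remove detection i; emptied clusters vanish automatically
        labels = sorted(k for k in set(z) if k >= 0)
        logw = [np.log(np.sum(z == k))
                - 0.5 * ((feats[i] - feats[z == k].mean()) / sigma) ** 2
                for k in labels]
        logw = np.asarray(logw + [np.log(alpha)])  # last entry: new cluster
        w = np.exp(logw - logw.max())
        w /= w.sum()
        choice = rng.choice(len(w), p=w)
        z[i] = labels[choice] if choice < len(labels) else max(labels, default=-1) + 1
    return z

for _ in range(20):
    z = gibbs_sweep(z)
print(len(set(z)))  # inferred number of groups, with no cap imposed
```

Because cluster membership is resampled rather than fixed in advance, the same sampler would report more or fewer groups as the scene changes, which is what makes the DPMM attractive for counting an unknown number of people.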
Markov Switching
Markov switching models are a popular family of models that introduce
time-variation in the parameters in the form of state- or regime-specific
values. Importantly, this time-variation is governed by a discrete-valued
latent stochastic process with limited memory. More specifically, the current
value of the state indicator depends only on the value of the state indicator
from the previous period (the Markov property) and on the transition matrix.
The latter characterizes the Markov process by determining the probability
with which each of the states can be visited next period, given the state in
the current period. This setup gives rise to the two main advantages of
Markov switching models: the estimation of the probability of state
occurrences in each of the sample periods using filtering and smoothing
methods, and the estimation of the state-specific parameters. These two
features open the possibility of improved interpretation of the parameters
associated with specific regimes, combined with the corresponding regime
probabilities, as well as of improved forecasting performance based on
persistent regimes and the parameters characterizing them.

Comment: Keywords: Transition Probabilities, Exogenous Markov Switching,
Infinite Hidden Markov Model, Endogenous Markov Switching, Markov Process,
Finite Mixture Model, Change-point Model, Non-homogeneous Markov Switching,
Time Series Analysis, Business Cycle Analysis
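The filtering step described above, recovering the probability of each state in every sample period from the transition matrix and the state-specific parameters, can be sketched with a basic Hamilton-style filter on a simulated two-regime series. The regime means, transition probabilities, and series length are illustrative assumptions, not values from the text.

```python
import numpy as np

rng = np.random.default_rng(4)

# Two-regime toy series: a low-mean and a high-mean state with
# persistent transitions (all parameter values are illustrative).
P = np.array([[0.95, 0.05],
              [0.10, 0.90]])          # transition matrix; rows sum to 1
mus, sigma = np.array([0.0, 2.0]), 0.5  # state-specific means, common sd

# Simulate a latent regime path and the observed series.
s = [0]
for _ in range(199):
    s.append(rng.choice(2, p=P[s[-1]]))
s = np.array(s)
y = rng.normal(mus[s], sigma)

def hamilton_filter(y, P, mus, sigma):
    """Filtered state probabilities Pr(s_t = k | y_1, ..., y_t)."""
    n_states = len(mus)
    probs = np.full(n_states, 1.0 / n_states)
    out = np.empty((len(y), n_states))
    for t, yt in enumerate(y):
        pred = probs @ P                                 # one-step-ahead probs
        lik = np.exp(-0.5 * ((yt - mus) / sigma) ** 2)   # state likelihoods
        probs = pred * lik
        probs /= probs.sum()                             # normalize: filtered
        out[t] = probs
    return out

filt = hamilton_filter(y, P, mus, sigma)
acc = (filt.argmax(axis=1) == s).mean()
print(acc)  # fraction of periods where the filtered mode matches the truth
```

Because the regimes here are persistent and well separated, the filtered probabilities track the true state closely, illustrating the first of the two advantages named in the abstract; a backward smoothing pass would sharpen them further.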