
    Accelerated Parallel Non-conjugate Sampling for Bayesian Non-parametric Models

    Inference of latent feature models in the Bayesian nonparametric setting is generally difficult, especially in high-dimensional settings, because it usually requires proposing features from some prior distribution. In special cases, where the integration is tractable, we can sample new feature assignments according to a predictive likelihood. However, this still may not be efficient in high dimensions. We present a novel method to accelerate the mixing of latent variable model inference by proposing feature locations from the data, as opposed to the prior. First, we introduce our accelerated feature proposal mechanism and show that it is a valid Bayesian inference algorithm; next, we propose an approximate inference strategy to perform accelerated inference in parallel. This sampling method mixes the Markov chain Monte Carlo sampler well, is computationally attractive, and is theoretically guaranteed to converge to the posterior distribution as its limiting distribution. Comment: Previously known as "Accelerated Inference for Latent Variable Models".
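
    The core idea above, proposing feature locations from the data rather than from the prior while keeping the sampler valid through a Metropolis-Hastings correction, can be sketched in a toy one-dimensional setting. All names, parameter values, and the Gaussian model below are illustrative assumptions, not the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(3.0, 0.5, size=200)          # toy observations

def log_post(mu, sigma=0.5, sigma0=10.0):
    # Gaussian likelihood plus a broad Gaussian prior (unnormalized)
    return -0.5 * np.sum((data - mu) ** 2) / sigma**2 - 0.5 * mu**2 / sigma0**2

def data_driven_proposal(eps=0.25):
    # Propose a feature location near a randomly chosen data point,
    # instead of drawing from the vague prior
    return rng.normal(rng.choice(data), eps)

def proposal_logpdf(mu, eps=0.25):
    # Density of the data-driven proposal (a kernel mixture over the data),
    # needed in the acceptance ratio so the sampler stays exact
    kernels = np.exp(-0.5 * ((mu - data) / eps) ** 2) / (eps * np.sqrt(2 * np.pi))
    return np.log(np.mean(kernels))

mu = 0.0                                        # start far from the data
for _ in range(500):
    cand = data_driven_proposal()
    log_accept = (log_post(cand) - log_post(mu)
                  + proposal_logpdf(mu) - proposal_logpdf(cand))
    if np.log(rng.random()) < log_accept:
        mu = cand
# mu ends up in the data's high-posterior region (around 3.0 here)
```

    Because the proposal density enters the acceptance ratio, targeting the data does not bias the chain; it only changes how quickly the sampler finds high-posterior feature locations.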

    Scalable Estimation of Dirichlet Process Mixture Models on Distributed Data

    We consider the estimation of Dirichlet Process Mixture Models (DPMMs) in distributed environments, where data are distributed across multiple computing nodes. A key advantage of Bayesian nonparametric models such as DPMMs is that they allow new components to be introduced on the fly as needed. This, however, poses an important challenge to distributed estimation -- how to handle new components efficiently and consistently. To tackle this problem, we propose a new estimation method, which allows new components to be created locally in individual computing nodes. Components corresponding to the same cluster will be identified and merged via a probabilistic consolidation scheme. In this way, we can maintain the consistency of estimation with very low communication cost. Experiments on large real-world data sets show that the proposed method can achieve high scalability in distributed and asynchronous environments without compromising the mixing performance. Comment: This paper was published at IJCAI 2017. https://www.ijcai.org/proceedings/2017/64
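
    A minimal sketch of the local-creation-plus-consolidation idea: each worker opens components on the fly for its own shard, and a master merges components that describe the same cluster. The deterministic distance-based merge below is a simplified stand-in for the paper's probabilistic consolidation scheme, and all thresholds and names are assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Two workers, each holding a shard drawn from the same two clusters
shards = [np.concatenate([rng.normal(0.0, 0.3, 50), rng.normal(5.0, 0.3, 50)])
          for _ in range(2)]

def local_components(x, new_thresh=2.0):
    # A worker creates components on the fly: assign each point to the
    # nearest existing component, or open a new one if none is close
    comps = []                                  # list of (sum, count)
    for xi in x:
        means = [s / n for s, n in comps]
        if comps and min(abs(xi - m) for m in means) < new_thresh:
            j = int(np.argmin([abs(xi - m) for m in means]))
            s, n = comps[j]
            comps[j] = (s + xi, n + 1)
        else:
            comps.append((xi, 1))               # new component, created locally
    return comps

def consolidate(all_comps, merge_thresh=2.0):
    # Master merges components from different workers whose means are close
    # (simplified stand-in for probabilistic consolidation)
    merged = []
    for s, n in all_comps:
        m = s / n
        for k, (ms, mn) in enumerate(merged):
            if abs(ms / mn - m) < merge_thresh:
                merged[k] = (ms + s, mn + n)    # same cluster: pool statistics
                break
        else:
            merged.append((s, n))
    return merged

local = [c for shard in shards for c in local_components(shard)]
global_comps = consolidate(local)
# global_comps now holds one pooled component per true cluster
```

    Only the per-component sufficient statistics travel to the master, which is what keeps the communication cost low.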

    Asymptotically Exact, Embarrassingly Parallel MCMC

    Communication costs, resulting from synchronization requirements during learning, can greatly slow down many parallel machine learning algorithms. In this paper, we present a parallel Markov chain Monte Carlo (MCMC) algorithm in which subsets of data are processed independently, with very little communication. First, we arbitrarily partition data onto multiple machines. Then, on each machine, any classical MCMC method (e.g., Gibbs sampling) may be used to draw samples from a posterior distribution given the data subset. Finally, the samples from each machine are combined to form samples from the full posterior. This embarrassingly parallel algorithm allows each machine to act independently on a subset of the data (without communication) until the final combination stage. We prove that our algorithm generates asymptotically exact samples and empirically demonstrate its ability to parallelize burn-in and sampling in several models.
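
    The three steps (partition, sample locally, combine) can be sketched for the simple case where each subposterior is approximated as a Gaussian, so the combination is a precision-weighted product. This is the parametric special case; the paper also develops asymptotically exact nonparametric combinations. The toy model and all parameter values are assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy task: infer the mean of Gaussian data (known unit variance, flat prior)
data = rng.normal(1.5, 1.0, size=900)
shards = np.array_split(data, 3)            # step 1: arbitrary partition

def subposterior_samples(x, n=5000):
    # Step 2: exact MCMC-free sampling on one machine. With a flat prior,
    # each subposterior is N(mean(x), 1/len(x)).
    return rng.normal(x.mean(), 1.0 / np.sqrt(len(x)), size=n)

subs = [subposterior_samples(s) for s in shards]

# Step 3: combine. Under a Gaussian approximation to each subposterior,
# the product density is Gaussian with precision-weighted mean.
precisions = np.array([1.0 / s.var() for s in subs])
means = np.array([s.mean() for s in subs])
combined_mean = (precisions * means).sum() / precisions.sum()
combined_var = 1.0 / precisions.sum()
# combined_mean approximates the full-data posterior mean (near 1.5 here)
```

    No communication happens until the final combination: each machine only ships its samples (or, here, their summary statistics) once.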

    Counting People by Clustering Person Detector Outputs

    We present a people counting system that estimates the number of people in a scene by employing a clustering scheme based on Dirichlet Process Mixture Models (DPMMs), which takes the outputs of a person detector as input. For each frame, we run a person detector, take its output as a set of detection areas, and define a set of features based on spatial, color, and temporal information for each detection. Then, using these features, we cluster the detections using DPMMs and Gibbs sampling with no restriction on the number of clusters, and can thus estimate an arbitrary number of people or groups of people. We finally define a measure to calculate the actual number of people within each cluster to infer the final estimate of the number of people in the scene.
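
    A stripped-down version of clustering detections with a Chinese-restaurant-process-style collapsed Gibbs sweep, with no cap on the number of clusters, might look like the sketch below. The 2-D "detections", the simplified predictive weights, and every parameter are stand-ins for the paper's richer feature set and model, not its actual implementation:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy "detections": 2-D box centroids from two separate groups of people
dets = np.vstack([rng.normal([2.0, 2.0], 0.3, size=(30, 2)),
                  rng.normal([8.0, 7.0], 0.3, size=(30, 2))])

def crp_gibbs(x, alpha=0.1, sigma=0.3, sigma0=5.0, iters=30):
    # Collapsed Gibbs sweeps over cluster labels; the number of clusters
    # is unbounded, since any point may open a new cluster
    n = len(x)
    z = np.zeros(n, dtype=int)              # start: everyone in one cluster
    for _ in range(iters):
        for i in range(n):
            z[i] = -1                        # remove point i from its cluster
            labels = sorted(k for k in set(z) if k >= 0)
            logw = []
            for k in labels:
                members = x[z == k]
                m = members.mean(axis=0)
                # CRP weight (cluster size) times a simplified Gaussian
                # predictive centered at the cluster mean
                logw.append(np.log(len(members))
                            - 0.5 * np.sum((x[i] - m) ** 2) / sigma**2)
            # weight for opening a brand-new cluster: concentration alpha
            # times a broad base-measure predictive (an approximation)
            logw.append(np.log(alpha)
                        - 0.5 * np.sum(x[i] ** 2) / sigma0**2
                        - 2.0 * np.log(sigma0 / sigma))
            w = np.exp(np.array(logw) - max(logw))
            pick = rng.choice(len(w), p=w / w.sum())
            z[i] = labels[pick] if pick < len(labels) else max(labels, default=-1) + 1
    return z

z = crp_gibbs(dets)
n_groups = len(set(z))   # number of groups discovered (2 for this toy data)
```

    The point of the construction is exactly the one the abstract highlights: the sampler decides how many groups there are, so the count of people or groups falls out of the clustering rather than being fixed in advance.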

    Markov Switching

    Markov switching models are a popular family of models that introduce time-variation in the parameters in the form of state- or regime-specific values. Importantly, this time-variation is governed by a discrete-valued latent stochastic process with limited memory. More specifically, the current value of the state indicator is determined only by the value of the state indicator from the previous period (the Markov property) and the transition matrix. The latter characterizes the Markov process by determining the probability with which each of the states can be visited next period, given the state in the current period. This setup gives rise to the two main advantages of Markov switching models: the estimation of the probability of state occurrences in each of the sample periods using filtering and smoothing methods, and the estimation of the state-specific parameters. These two features open up the possibility of improved interpretation of the parameters associated with specific regimes combined with the corresponding regime probabilities, as well as of improved forecasting performance based on persistent regimes and the parameters characterizing them. Comment: Keywords: Transition Probabilities, Exogenous Markov Switching, Infinite Hidden Markov Model, Endogenous Markov Switching, Markov Process, Finite Mixture Model, Change-point Model, Non-homogeneous Markov Switching, Time Series Analysis, Business Cycle Analysis
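
    The filtering step described above, estimating the state probabilities period by period from the transition matrix and the state-specific parameters, is the Hamilton filter. A minimal two-state sketch on simulated data (the transition matrix, state means, and noise level are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(4)

# Simulate a two-state Markov switching model for the mean of a series
P = np.array([[0.95, 0.05],       # P[i, j] = Pr(s_t = j | s_{t-1} = i)
              [0.10, 0.90]])
mu = np.array([0.0, 3.0])         # state-specific means
sigma = 0.5

T, s = 300, 0
states, y = [], []
for _ in range(T):
    s = rng.choice(2, p=P[s])     # limited memory: depends on s_{t-1} only
    states.append(s)
    y.append(rng.normal(mu[s], sigma))
y = np.asarray(y)

def norm_pdf(x, m, sd):
    return np.exp(-0.5 * ((x - m) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

def hamilton_filter(y, P, mu, sigma):
    # Recursively compute the filtered probabilities Pr(s_t | y_1..t)
    probs = np.full(2, 0.5)                   # prior over the initial state
    out = []
    for yt in y:
        pred = P.T @ probs                    # predict: one step of the chain
        post = pred * norm_pdf(yt, mu, sigma) # update with the likelihood
        probs = post / post.sum()
        out.append(probs)
    return np.array(out)

filt = hamilton_filter(y, P, mu, sigma)
accuracy = (filt.argmax(axis=1) == np.array(states)).mean()
# with well-separated state means, the filter recovers the regimes
```

    Smoothed probabilities Pr(s_t | y_1..T) would add a backward pass over the same quantities; the filter alone already delivers the regime probabilities the text mentions.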
