Streaming, Distributed Variational Inference for Bayesian Nonparametrics
This paper presents a methodology for creating streaming, distributed
inference algorithms for Bayesian nonparametric (BNP) models. In the proposed
framework, processing nodes receive a sequence of data minibatches, compute a
variational posterior for each, and make asynchronous streaming updates to a
central model. In contrast to previous algorithms, the proposed framework is
truly streaming, distributed, asynchronous, learning-rate-free, and
truncation-free. The key challenge in developing the framework, arising from
the fact that BNP models do not impose an inherent ordering on their
components, is finding the correspondence between minibatch and central BNP
posterior components before performing each update. To address this, the paper
develops a combinatorial optimization problem over component correspondences,
and provides an efficient solution technique. The paper concludes with an
application of the methodology to the DP mixture model, with experimental
results demonstrating its practical scalability and performance.
Comment: This paper was presented at NIPS 2015. Please use the following
BibTeX citation: @inproceedings{Campbell15_NIPS, Author = {Trevor Campbell
and Julian Straub and John W. {Fisher III} and Jonathan P. How}, Title =
{Streaming, Distributed Variational Inference for Bayesian Nonparametrics},
Booktitle = {Advances in Neural Information Processing Systems (NIPS)}, Year
= {2015}
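The correspondence step described in the abstract above can be illustrated with a toy sketch. This is not the paper's optimization problem or its solution technique: it assumes components are summarized by mean vectors, uses squared Euclidean distance as a stand-in cost, charges a hypothetical fixed `new_cost` for opening a new central component, and solves the matching by brute force, which is only feasible for a handful of components.

```python
from itertools import permutations

def match_components(batch_means, central_means, new_cost):
    """Match each minibatch component to a distinct central component or to a new one.

    Returns (assign, cost), where assign[i] is the index of the central
    component matched to minibatch component i, or None for a new component.
    Brute force over one-to-one matchings; illustrative only.
    """
    n_batch = len(batch_means)
    # Candidate slots: every central component, plus one "new" slot per
    # minibatch component so that all of them may open new components.
    slots = list(range(len(central_means))) + [None] * n_batch
    best_cost, best_assign = float("inf"), None
    for perm in permutations(range(len(slots)), n_batch):
        cost = 0.0
        for i, s in enumerate(perm):
            j = slots[s]
            if j is None:
                cost += new_cost  # hypothetical penalty for a new component
            else:
                cost += sum((x - y) ** 2
                            for x, y in zip(batch_means[i], central_means[j]))
        if cost < best_cost:
            best_cost, best_assign = cost, [slots[s] for s in perm]
    return best_assign, best_cost

central = [(0.0, 0.0), (5.0, 5.0)]
batch = [(5.1, 4.9), (0.2, -0.1), (10.0, 10.0)]
assign, cost = match_components(batch, central, new_cost=4.0)
print(assign, cost)
```

In this toy setup the first two minibatch components sit close to existing central components, while the third is far from both, so opening a new component is cheaper than forcing a match.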
The Discrete Infinite Logistic Normal Distribution
We present the discrete infinite logistic normal distribution (DILN), a
Bayesian nonparametric prior for mixed membership models. DILN is a
generalization of the hierarchical Dirichlet process (HDP) that models
correlation structure between the weights of the atoms at the group level. We
derive a representation of DILN as a normalized collection of gamma-distributed
random variables, and study its statistical properties. We consider
applications to topic modeling and derive a variational inference algorithm for
approximate posterior inference. We study the empirical performance of the DILN
topic model on four corpora, comparing performance with the HDP and the
correlated topic model (CTM). To deal with large-scale data sets, we also
develop an online inference algorithm for DILN and compare with online HDP and
online LDA on the Nature magazine corpus, which contains approximately 350,000
articles.
Comment: This paper will appear in Bayesian Analysis. A shorter version of
this paper appeared at AISTATS 2011, Fort Lauderdale, FL, US
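As background for the "normalized collection of gamma-distributed random variables" mentioned in the abstract above, the sketch below shows the standard fact that normalizing independent Gamma(βp_k, 1) draws yields a Dirichlet(βp) vector. It does not reproduce DILN itself, whose construction goes further to model correlation between weights; the function name and parameter values are illustrative assumptions.

```python
import random

def dirichlet_via_gammas(beta, p, rng):
    """Draw from Dirichlet(beta * p) by normalizing independent gamma variables."""
    # gammavariate(shape, scale): one unit-scale gamma draw per component.
    z = [rng.gammavariate(beta * p_k, 1.0) for p_k in p]
    total = sum(z)
    return [z_k / total for z_k in z]

rng = random.Random(0)
p = [0.5, 0.3, 0.2]          # illustrative base weights
weights = dirichlet_via_gammas(5.0, p, rng)
print(weights)
```

Averaged over many draws, the normalized weights concentrate around `p`, with `beta` controlling how tightly.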
A nonparametric Bayesian approach toward robot learning by demonstration
In recent years, many authors have considered applying machine learning methodologies to robot learning by demonstration. Gaussian mixture regression (GMR) is one of the most successful methodologies used for this purpose. A major limitation of GMR models concerns automatic selection of the proper number of model states, i.e., the number of model component densities. Existing methods, including likelihood- or entropy-based criteria, usually tend to yield noisy model size estimates while imposing heavy computational requirements. Recently, Dirichlet process (infinite) mixture models have emerged as a cornerstone of nonparametric Bayesian statistics and as promising candidates for clustering applications where the number of clusters is unknown a priori. Motivated by this, and to resolve the aforementioned issues of GMR-based methods for robot learning by demonstration, in this paper we introduce a nonparametric Bayesian formulation of the GMR model, the Dirichlet process GMR model. We derive an efficient variational Bayesian inference algorithm for the proposed model, and we experimentally investigate its efficacy as a robot learning by demonstration methodology, considering a number of demanding robot learning by demonstration scenarios.
Stochastic Variational Inference
We develop stochastic variational inference, a scalable algorithm for
approximating posterior distributions. We develop this technique for a large
class of probabilistic models and we demonstrate it with two probabilistic
topic models, latent Dirichlet allocation and the hierarchical Dirichlet
process topic model. Using stochastic variational inference, we analyze several
large collections of documents: 300K articles from Nature, 1.8M articles from
The New York Times, and 3.8M articles from Wikipedia. Stochastic inference can
easily handle data sets of this size and outperforms traditional variational
inference, which can only handle a smaller subset. (We also show that the
Bayesian nonparametric topic model outperforms its parametric counterpart.)
Stochastic variational inference lets us apply complex Bayesian models to
massive data sets.
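The core update behind stochastic variational inference can be sketched on a deliberately simple model. The example below assumes a Beta-Bernoulli model with no local latent variables, so it is far simpler than the topic models in the abstract above; the function name, default step-size settings (`kappa`, `tau`), and synthetic data are all illustrative assumptions. Each step treats a minibatch as if it were the full data set, forms an intermediate estimate of the global parameters, and blends it into the current estimate with a decaying step size.

```python
import random

def svi_beta_bernoulli(data, batch_size=100, steps=500,
                       kappa=0.7, tau=1.0, a0=1.0, b0=1.0, seed=0):
    """Stochastic updates for the Beta posterior parameters (a, b)."""
    rng = random.Random(seed)
    n = len(data)
    a, b = a0, b0
    for t in range(1, steps + 1):
        batch = rng.sample(data, batch_size)
        ones = sum(batch)
        # Intermediate estimate: pretend the minibatch is replicated
        # n / batch_size times to stand in for the whole data set.
        scale = n / batch_size
        a_hat = a0 + scale * ones
        b_hat = b0 + scale * (batch_size - ones)
        # Decaying step size; kappa in (0.5, 1] is the usual condition.
        rho = (t + tau) ** (-kappa)
        a = (1.0 - rho) * a + rho * a_hat
        b = (1.0 - rho) * b + rho * b_hat
    return a, b

data_rng = random.Random(42)
data = [1 if data_rng.random() < 0.3 else 0 for _ in range(10000)]
a, b = svi_beta_bernoulli(data)
exact_a = 1.0 + sum(data)
exact_b = 1.0 + len(data) - sum(data)
print(a / (a + b), exact_a / (exact_a + exact_b))
```

Because this model is conjugate, the exact posterior is available in closed form, which makes it easy to check that the stochastic estimate lands near it while only ever touching 100 points per step.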
A trust-region method for stochastic variational inference with applications to streaming data
Stochastic variational inference allows for fast posterior inference in
complex Bayesian models. However, the algorithm is prone to local optima which
can make the quality of the posterior approximation sensitive to the choice of
hyperparameters and initialization. We address this problem by replacing the
natural gradient step of stochastic variational inference with a trust-region
update. We show that this leads to generally better results and reduced
sensitivity to hyperparameters. We also describe a new strategy for variational
inference on streaming data and show that here our trust-region method is
crucial for getting good performance.
Comment: in Proceedings of the 32nd International Conference on Machine
Learning, 201
Event detection in location-based social networks
With the advent of social networks and the rise of mobile technologies, users have become ubiquitous sensors capable of monitoring various real-world events in a crowd-sourced manner. Location-based social networks have proven to be faster than traditional media channels in reporting and geo-locating breaking news; e.g., Osama Bin Laden's death was first confirmed on Twitter even before the announcement from the communication department at the White House. However, the deluge of user-generated data on these networks requires intelligent systems capable of identifying and characterizing such events in a comprehensive manner. The data mining community coined the term event detection to refer to the task of uncovering emerging patterns in data streams. Nonetheless, most data mining techniques do not reproduce the underlying data generation process, which hampers their ability to self-adapt in fast-changing scenarios. Because of this, we propose a probabilistic machine learning approach to event detection which explicitly models the data generation process and enables reasoning about the discovered events. To set forth the differences between both approaches, we present two techniques for the problem of event detection in Twitter: a data mining technique called Tweet-SCAN and a machine learning technique called Warble. We assess and compare both techniques on a dataset of tweets geo-located in the city of Barcelona during its annual festivities. Finally, we present the algorithmic changes and data processing frameworks needed to scale up the proposed techniques to big data workloads.
This work is partially supported by Obra Social “la Caixa”, by the Spanish Ministry of Science and Innovation under contract (TIN2015-65316), by the Severo Ochoa Program (SEV2015-0493), by SGR programs of the Catalan Government (2014-SGR-1051, 2014-SGR-118), Collectiveware (TIN2015-66863-C2-1-R) and BSC/UPC NVIDIA GPU Center of Excellence. We would also like to thank the reviewers for their constructive feedback.
Peer Reviewed. Postprint (author's final draft)