2,609 research outputs found

Sparse Stochastic Inference for Latent Dirichlet allocation

Authors: David Blei, Matt Hoffman, David Mimno
Publication date: 01/01/2012

We present a hybrid algorithm for Bayesian topic models that combines the efficiency of sparse Gibbs sampling with the scalability of online stochastic inference. We used our algorithm to analyze a corpus of 1.2 million books (33 billion words) with thousands of topics. Our approach reduces the bias of variational inference and generalizes to many Bayesian hidden-variable models.

Comment: Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012)

arXiv.org e-Print Archive

CiteSeerX

Princeton University Open Access Repository

An Empirical Study of Stochastic Variational Algorithms for the Beta Bernoulli Process

Authors: Zoubin Ghahramani, David A. Knowles, Amar Shah
Publication date: 26/06/2015

Stochastic variational inference (SVI) is emerging as the most promising candidate for scaling inference in Bayesian probabilistic models to large datasets. However, the performance of these methods has been assessed primarily in the context of Bayesian topic models, particularly latent Dirichlet allocation (LDA). Deriving several new algorithms, and using synthetic, image and genomic datasets, we investigate whether the understanding gleaned from LDA applies in the setting of sparse latent factor models, specifically beta process factor analysis (BPFA). We demonstrate that the big picture is consistent: using Gibbs sampling within SVI to maintain certain posterior dependencies is extremely effective. However, we find that different posterior dependencies are important in BPFA relative to LDA. In particular, approximations able to model intra-local variable dependence perform best.

Comment: ICML, 12 pages. Volume 37: Proceedings of The 32nd International Conference on Machine Learning, 2015

arXiv.org e-Print Archive

CiteSeerX

Efficient Correlated Topic Modeling with Topic Embedding

Authors: Taylor Berg-Kirkpatrick, Junxian He, Zhiting Hu, Ying Huang, Eric P. Xing
Publication date: 01/07/2017

Correlated topic modeling has been limited to small model and problem sizes due to its high computational cost and poor scaling. In this paper, we propose a new model which learns compact topic embeddings and captures topic correlations through the closeness between the topic vectors. Our method enables efficient inference in the low-dimensional embedding space, reducing previous cubic or quadratic time complexity to linear w.r.t. the topic size. We further speed up variational inference with a fast sampler that exploits the sparsity of topic occurrence. Extensive experiments show that our approach is capable of handling model and data scales which are several orders of magnitude larger than existing correlation results, without sacrificing modeling quality, providing competitive or superior performance in document classification and retrieval.

Comment: KDD 2017 oral. The first two authors contributed equally

arXiv.org e-Print Archive

Crossref