This note shows how to integrate out the multinomial parameters in latent Dirichlet allocation (LDA) and naive Bayes (NB) models, which allows Gibbs sampling to proceed without drawing multinomial parameter samples. Although the conjugacy of the Dirichlet priors makes sampling the multinomial parameters relatively straightforward, sampling topic assignments directly provides two advantages. First, all samples are drawn from simple discrete distributions with easily computed parameters. Second, and more importantly, collapsing supports fully stochastic Gibbs sampling, in which the model is updated after each word (in LDA) or each document (in NB) is assigned a topic. More stochastic sampling typically leads to quicker convergence to the stationary distribution of the Markov chain defined by the Gibbs samples.

Both LDA and NB are topic models, in which words are generated from topic-specific multinomials. The main difference is that LDA assumes each word in a document is drawn from a document-specific mixture of topics, whereas NB assumes all the words in a document are drawn from a single topic. In a hierarchical model, the topic and word priors would themselves be estimated; here, we assume the priors are fixed hyperparameters in both the NB and LDA models.

1 LDA Model
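To make the collapsed sampler concrete before the derivation, here is a minimal sketch of a collapsed Gibbs sampler for LDA along the lines described above: topics are resampled word by word from a simple discrete distribution over count statistics, and the counts are updated immediately after each assignment (fully stochastic updates). The toy corpus, variable names, and hyperparameter values are illustrative assumptions, not taken from the note.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy corpus: each document is a list of word ids over a vocabulary of size V.
docs = [[0, 1, 2, 1], [2, 3, 3, 0], [1, 1, 4, 2]]
V, K = 5, 2              # vocabulary size, number of topics
alpha, beta = 0.5, 0.1   # fixed Dirichlet hyperparameters (assumed symmetric)

# Count tables; z[d][i] is the topic assigned to word i of document d.
ndk = np.zeros((len(docs), K))  # topic counts per document
nkw = np.zeros((K, V))          # word counts per topic
nk = np.zeros(K)                # total words per topic
z = [[int(rng.integers(K)) for _ in doc] for doc in docs]
for d, doc in enumerate(docs):
    for i, w in enumerate(doc):
        k = z[d][i]
        ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1

for sweep in range(200):
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]  # remove the current assignment from the counts
            ndk[d, k] -= 1; nkw[k, w] -= 1; nk[k] -= 1
            # Collapsed conditional (multinomial parameters integrated out):
            #   p(z = k | rest) ∝ (n_dk + α) (n_kw + β) / (n_k + V β)
            p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + V * beta)
            k = int(rng.choice(K, p=p / p.sum()))
            z[d][i] = k  # counts updated immediately after each word
            ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
```

The key point the sketch illustrates is that each draw requires only the count tables, not explicit multinomial parameter samples, which is what makes the per-word update cheap.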