6,280 research outputs found
Scalable Recommendation with Poisson Factorization
We develop a Bayesian Poisson matrix factorization model for forming
recommendations from sparse user behavior data. These data are large user/item
matrices where each user has provided feedback on only a small subset of items,
either explicitly (e.g., through star ratings) or implicitly (e.g., through
views or purchases). In contrast to traditional matrix factorization
approaches, Poisson factorization implicitly models each user's limited
attention to consume items. Moreover, because of the mathematical form of the
Poisson likelihood, the model needs only to explicitly consider the observed
entries in the matrix, leading to both scalable computation and good predictive
performance. We develop a variational inference algorithm for approximate
posterior inference that scales up to massive data sets. This is an efficient
algorithm that iterates over the observed entries and adjusts an approximate
posterior over the user/item representations. We apply our method to large
real-world user data containing users rating movies, users listening to songs,
and users reading scientific papers. In all these settings, Bayesian Poisson
factorization outperforms state-of-the-art matrix factorization methods
Scalable Bayesian Non-Negative Tensor Factorization for Massive Count Data
We present a Bayesian non-negative tensor factorization model for
count-valued tensor data, and develop scalable inference algorithms (both batch
and online) for dealing with massive tensors. Our generative model can handle
overdispersed counts as well as infer the rank of the decomposition. Moreover,
leveraging a reparameterization of the Poisson distribution as a multinomial
facilitates conjugacy in the model and enables simple and efficient Gibbs
sampling and variational Bayes (VB) inference updates, with a computational
cost that only depends on the number of nonzeros in the tensor. The model also
provides a nice interpretability for the factors; in our model, each factor
corresponds to a "topic". We develop a set of online inference algorithms that
allow further scaling up the model to massive tensors, for which batch
inference methods may be infeasible. We apply our framework on diverse
real-world applications, such as \emph{multiway} topic modeling on a scientific
publications database, analyzing a political science data set, and analyzing a
massive household transactions data set.Comment: ECML PKDD 201
- …