Hierarchical Compound Poisson Factorization
Non-negative matrix factorization models based on a hierarchical
Gamma-Poisson structure capture user and item behavior effectively in extremely
sparse data sets, making them the ideal choice for collaborative filtering
applications. Hierarchical Poisson factorization (HPF) in particular has proved
successful for scalable recommendation systems with extreme sparsity. HPF,
however, suffers from a tight coupling of sparsity model (absence of a rating)
and response model (the value of the rating), which limits the expressiveness
of the latter. Here, we introduce hierarchical compound Poisson factorization
(HCPF) that has the favorable Gamma-Poisson structure and scalability of HPF to
high-dimensional extremely sparse matrices. More importantly, HCPF decouples
the sparsity model from the response model, allowing us to choose the most
suitable distribution for the response. HCPF can capture binary, non-negative
discrete, non-negative continuous, and zero-inflated continuous responses. We
compare HCPF with HPF on nine discrete and three continuous data sets and
conclude that HCPF captures the relationship between sparsity and response
better than HPF.
Comment: To appear in the Proceedings of the 33rd International Conference on
Machine Learning, New York, NY, USA, 2016. JMLR: W&CP volume 4
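The decoupling described in the abstract above can be illustrated with a small generative sketch. This is not the authors' implementation. It assumes a simplified model in which Gamma-distributed user and item factors set a Poisson rate, the Poisson count decides whether an entry is present (the sparsity model), and the response value is the sum of that many draws from a freely chosen element distribution (the response model). The function names, the Gamma hyperparameters, and the choice of a Gamma element distribution are all hypothetical.

```python
import numpy as np

def sample_compound_poisson_matrix(n_users, n_items, n_factors,
                                   element_sampler, seed=0):
    """Draw a sparse response matrix from a compound Poisson sketch."""
    rng = np.random.default_rng(seed)
    # Non-negative Gamma factors; shape and scale values are illustrative only.
    theta = rng.gamma(shape=0.3, scale=1.0, size=(n_users, n_factors))
    beta = rng.gamma(shape=0.3, scale=1.0, size=(n_items, n_factors))
    rate = theta @ beta.T
    # Sparsity model: a latent count of zero means the entry is absent.
    counts = rng.poisson(rate)
    # Response model: sum of `count` independent element draws.
    response = np.zeros(rate.shape)
    for u, i in zip(*np.nonzero(counts)):
        response[u, i] = element_sampler(rng, counts[u, i]).sum()
    return counts, response

def gamma_elements(rng, n):
    # Hypothetical element distribution for non-negative continuous responses.
    return rng.gamma(shape=2.0, scale=1.5, size=n)

counts, response = sample_compound_poisson_matrix(50, 40, 5, gamma_elements)
```

Swapping `gamma_elements` for another sampler (for example a Poisson or normal draw) changes the response distribution without touching how sparsity is generated, which is the separation the abstract emphasizes.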
Expandable Factor Analysis
Bayesian sparse factor models have proven useful for characterizing
dependence in multivariate data, but scaling computation to large numbers of
samples and dimensions is problematic. We propose expandable factor analysis
for scalable inference in factor models when the number of factors is unknown.
The method relies on a continuous shrinkage prior for efficient maximum a
posteriori estimation of a low-rank and sparse loadings matrix. The structure
of the prior leads to an estimation algorithm that accommodates uncertainty in
the number of factors. We propose an information criterion to select the
hyperparameters of the prior. Expandable factor analysis has better false
discovery rates and true positive rates than its competitors across diverse
simulations. We apply the proposed approach to a gene expression study of aging
in mice, illustrating superior results relative to four competing methods.
Comment: 28 pages, 4 figures