14 research outputs found
Incorporating Side Information in Probabilistic Matrix Factorization with Gaussian Processes
Probabilistic matrix factorization (PMF) is a powerful method for modeling
data associated with pairwise relationships, finding use in collaborative
filtering, computational biology, and document analysis, among other areas. In
many domains, there is additional information that can assist in prediction.
For example, when modeling movie ratings, we might know when the rating
occurred, where the user lives, or what actors appear in the movie. It is
difficult, however, to incorporate this side information into the PMF model. We
propose a framework for incorporating side information by coupling together
multiple PMF problems via Gaussian process priors. We replace scalar latent
features with functions that vary over the space of side information. The GP
priors on these functions require them to vary smoothly and share information.
We successfully use this new method to predict the scores of professional
basketball games, where side information about the venue and date of the game
is relevant for the outcome.
Comment: 18 pages, 4 figures, Submitted to UAI 201
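The idea of replacing scalar latent features with functions over side information can be illustrated with a minimal sketch. It is not the authors' implementation: the squared-exponential kernel, the length scale, the example dates, and the helper names (`squared_exponential`, `kernel_matrix`, `predict`) are all assumptions chosen for illustration. The kernel matrix shows how a GP prior ties together feature values at nearby side-information points, and `predict` shows the PMF-style inner product evaluated at a given side-information value.

```python
import math

def squared_exponential(x1, x2, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between two scalar side-information values."""
    return variance * math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)

def kernel_matrix(points, length_scale=1.0, variance=1.0):
    """GP prior covariance of one latent feature function at the given points."""
    return [[squared_exponential(a, b, length_scale, variance) for b in points]
            for a in points]

def predict(u_funcs, v_funcs, t):
    """PMF-style prediction: inner product of latent feature functions at t."""
    return sum(u(t) * v(t) for u, v in zip(u_funcs, v_funcs))

# Hypothetical side information: game dates measured in days.
dates = [0.0, 1.0, 30.0]
K = kernel_matrix(dates, length_scale=5.0)

# Hypothetical latent feature functions for one home team (u) and one away team (v).
u = [lambda t: 1.0 + 0.1 * t, lambda t: 2.0]
v = [lambda t: 0.5, lambda t: 1.0 - 0.1 * t]
score_day0 = predict(u, v, 0.0)
score_day1 = predict(u, v, 1.0)
```

Under this kernel, games one day apart have strongly correlated latent features while games a month apart are nearly independent, which is one way the "vary smoothly and share information" behaviour could arise.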
Hybrid Collaborative Filtering with Autoencoders
Collaborative Filtering aims at exploiting the feedback of users to provide
personalised recommendations. Such algorithms look for latent variables in a
large sparse matrix of ratings. They can be enhanced by adding side information
to tackle the well-known cold start problem. While Neural Networks have had
tremendous success in image and speech recognition, they have received less
attention in Collaborative Filtering. This is all the more surprising given that
Neural Networks are able to discover latent variables in large and
heterogeneous datasets. In this paper, we introduce a Collaborative Filtering
Neural network architecture (CFN) that computes a non-linear Matrix
Factorization from sparse rating inputs and side information. We show
experimentally on the MovieLens and Douban datasets that CFN outperforms the
state of the art and benefits from side information. We provide an
implementation of the algorithm as a reusable plugin for Torch, a popular
Neural Network framework.
Distributed Bayesian Matrix Factorization with Limited Communication
Bayesian matrix factorization (BMF) is a powerful tool for producing low-rank
representations of matrices and for predicting missing values and providing
confidence intervals. Scaling up the posterior inference for massive-scale
matrices is challenging and requires distributing both data and computation
over many workers, making communication the main computational bottleneck.
Embarrassingly parallel inference would remove the communication needed, by
using completely independent computations on different data subsets, but it
suffers from the inherent unidentifiability of BMF solutions. We introduce a
hierarchical decomposition of the joint posterior distribution, which couples
the subset inferences, allowing for embarrassingly parallel computations in a
sequence of at most three stages. Using an efficient approximate
implementation, we show improvements empirically on both real and simulated
data. Our distributed approach is able to achieve a speed-up of almost an order
of magnitude over the full posterior, with a negligible effect on predictive
accuracy. Our method outperforms state-of-the-art embarrassingly parallel MCMC
methods in accuracy, and achieves results competitive with other available
distributed and parallel implementations of BMF.
Comment: 28 pages, 8 figures. The paper is published in Machine Learning
journal. An implementation of the method is available in SMURFF software
on GitHub (bmfpp branch): https://github.com/ExaScience/smurf
A comprehensive view of recommendation methods based on probabilistic techniques
This research aims to use a hybrid recommendation method, based on probabilistic
techniques and topic modeling, that gives users recommendations that fit them more
closely than traditional recommendation models do. We present a comprehensive review
of recommendation methods for content-based systems and collaborative filtering, mainly
in the domain of movie recommendation. The methods analyzed include Probabilistic Matrix
Factorization and Latent Dirichlet Allocation. The literature review of these models
focuses on identifying problems and open issues that future research could address. We
also analyze recommendation models that integrate latent factor methods and topic
modeling, which will serve as a baseline for comparing the results obtained with the
hybrid model.
PRONÓSTICO DE DEMANDA DE ENERGÍA ELÉCTRICA USANDO PROCESOS GAUSSIANOS: UN ANÁLISIS COMPARATIVO: Short-Term Load Demand Forecasting using Gaussian Processes: A Comparative Analysis
Load demand forecasting is an essential component of power system planning and an invaluable tool for grid operators or customers. Many methods have been proposed to provide reliable estimates of electric load demand, but few address the problem of predicting energy demand from a probabilistic point of view. One that does is the Gaussian process (GP), which, given an adequate covariance function, is a suitable tool for this load forecasting task. In this article, we show how to use Gaussian processes to predict electrical energy demand. Additionally, we thoroughly test various covariance functions and provide a new one. The performance of the proposed methodology was tested on two real data sets, showing that GPs are competitive alternatives for short-term load demand forecasting compared to other state-of-the-art methods.
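As a rough illustration of how a covariance function drives a GP forecast, the sketch below computes a GP posterior mean for hourly load. It is only an example under stated assumptions: the kernel (a squared-exponential term plus a 24-hour periodic term) is a common textbook choice and is not the new covariance function proposed in the article; the hyperparameters, the normalized load values, and the helper names are hypothetical.

```python
import math

def se_kernel(a, b, length_scale=3.0):
    """Squared-exponential term: nearby hours have similar load."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def periodic_kernel(a, b, period=24.0, length_scale=1.0):
    """Periodic term: load repeats with an assumed daily cycle."""
    s = math.sin(math.pi * abs(a - b) / period)
    return math.exp(-2.0 * s * s / length_scale ** 2)

def load_kernel(a, b):
    """Sum of two valid kernels, which is itself a valid kernel."""
    return se_kernel(a, b) + periodic_kernel(a, b)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def gp_posterior_mean(train_x, train_y, test_x, kernel, noise=1e-2):
    """Zero-mean GP posterior mean: k(test, train) (K + noise * I)^-1 y."""
    K = [[kernel(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(train_x)]
         for i, a in enumerate(train_x)]
    alpha = solve(K, train_y)
    return [sum(kernel(t, x) * a for x, a in zip(train_x, alpha))
            for t in test_x]

# Hypothetical normalized load readings taken every 6 hours over two days.
hours = [0.0, 6.0, 12.0, 18.0, 24.0, 30.0, 36.0, 42.0]
load = [0.5, 0.8, 1.0, 0.9, 0.5, 0.8, 1.0, 0.9]
forecast = gp_posterior_mean(hours, load, [48.0, 54.0], load_kernel)
```

Swapping `load_kernel` for a different function is all that is needed to compare covariance choices in this sketch, which mirrors the kind of comparison the abstract describes.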
Active learning and search on low-rank matrices
Collaborative prediction is a powerful technique, useful in domains from recommender systems to guiding the scientific discovery process. Low-rank matrix factorization is one of the most powerful tools for collaborative prediction. This work presents a general approach for active collaborative prediction with the Probabilistic Matrix Factorization model. Using variational approximations or Markov chain Monte Carlo sampling to estimate the posterior distribution over models, we can choose query points to maximize our understanding of the model, to best predict unknown elements of the data matrix, or to find as many "positive" data points as possible. We evaluate our methods on simulated data, and also show their applicability to movie ratings prediction and the discovery of drug-target interactions.
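One simple way to pick a query point from posterior samples is sketched below. It assumes an uncertainty-sampling criterion (query the unobserved entry whose prediction varies most across samples), which is only one possible reading of "maximize our understanding of the model" and is not claimed to be the paper's criterion. The posterior samples, matrix sizes, and function names are hypothetical.

```python
def predict(U, V, i, j):
    """Predicted matrix entry (i, j) as the inner product of row factors."""
    return sum(u * v for u, v in zip(U[i], V[j]))

def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def select_query(samples, unobserved):
    """Return the unobserved (i, j) whose prediction varies most across samples."""
    def score(entry):
        i, j = entry
        return variance([predict(U, V, i, j) for U, V in samples])
    return max(unobserved, key=score)

# Two hypothetical rank-1 posterior samples (U, V), e.g. drawn by MCMC.
samples = [
    ([[1.0], [2.0]], [[1.0], [3.0]]),
    ([[1.0], [2.0]], [[1.0], [1.0]]),
]
unobserved = [(0, 1), (1, 0), (1, 1)]
query = select_query(samples, unobserved)
```

Other goals named in the abstract, such as finding as many positive entries as possible, would replace the `score` function with a different criterion while keeping the same selection loop.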