14 research outputs found

    Incorporating Side Information in Probabilistic Matrix Factorization with Gaussian Processes

    Probabilistic matrix factorization (PMF) is a powerful method for modeling data associated with pairwise relationships, finding use in collaborative filtering, computational biology, and document analysis, among other areas. In many domains, there is additional information that can assist in prediction. For example, when modeling movie ratings, we might know when the rating occurred, where the user lives, or what actors appear in the movie. It is difficult, however, to incorporate this side information into the PMF model. We propose a framework for incorporating side information by coupling together multiple PMF problems via Gaussian process (GP) priors. We replace scalar latent features with functions that vary over the space of side information. The GP priors on these functions require them to vary smoothly and share information. We successfully use this new method to predict the scores of professional basketball games, where side information about the venue and date of the game is relevant for the outcome. Comment: 18 pages, 4 figures, Submitted to UAI 201

    Hybrid Collaborative Filtering with Autoencoders

    Collaborative Filtering aims at exploiting the feedback of users to provide personalised recommendations. Such algorithms look for latent variables in a large sparse matrix of ratings. They can be enhanced by adding side information to tackle the well-known cold start problem. While Neural Networks have had tremendous success in image and speech recognition, they have received less attention in Collaborative Filtering. This is all the more surprising given that Neural Networks are able to discover latent variables in large and heterogeneous datasets. In this paper, we introduce a Collaborative Filtering Neural network architecture (CFN) which computes a non-linear Matrix Factorization from sparse rating inputs and side information. We show experimentally on the MovieLens and Douban datasets that CFN outperforms the state of the art and benefits from side information. We provide an implementation of the algorithm as a reusable plugin for Torch, a popular Neural Network framework.

    Distributed Bayesian Matrix Factorization with Limited Communication

    Bayesian matrix factorization (BMF) is a powerful tool for producing low-rank representations of matrices, predicting missing values, and providing confidence intervals. Scaling up the posterior inference for massive-scale matrices is challenging and requires distributing both data and computation over many workers, making communication the main computational bottleneck. Embarrassingly parallel inference would remove the communication needed, by using completely independent computations on different data subsets, but it suffers from the inherent unidentifiability of BMF solutions. We introduce a hierarchical decomposition of the joint posterior distribution, which couples the subset inferences, allowing for embarrassingly parallel computations in a sequence of at most three stages. Using an efficient approximate implementation, we show improvements empirically on both real and simulated data. Our distributed approach is able to achieve a speed-up of almost an order of magnitude over the full posterior, with a negligible effect on predictive accuracy. Our method outperforms state-of-the-art embarrassingly parallel MCMC methods in accuracy, and achieves results competitive with other available distributed and parallel implementations of BMF. Comment: 28 pages, 8 figures. The paper is published in Machine Learning journal. An implementation of the method is available in SMURFF software on github (bmfpp branch): https://github.com/ExaScience/smurf

    A comprehensive view of recommendation methods based on probabilistic techniques

    This research aims to use a hybrid recommendation method, based on probabilistic techniques and topic modeling, that gives the user better-fitting recommendations than traditional recommendation models. We carry out a comprehensive review of recommendation methods for content-based systems and collaborative filtering, mainly in the domain of movie recommendation. The methods discussed include Probabilistic Matrix Factorization and Latent Dirichlet Allocation. The literature review around these models focuses on identifying problems and open issues that may be addressed in future research. We also analyze recommendation models that integrate latent factor methods and topic modeling, which will serve as a baseline for comparing the results obtained with the hybrid model.

    PRONÓSTICO DE DEMANDA DE ENERGÍA ELÉCTRICA USANDO PROCESOS GAUSSIANOS: UN ANÁLISIS COMPARATIVO: Short-Term Load Demand Forecasting using Gaussian Processes: A Comparative Analysis

    Load demand forecasting is an essential component for planning power systems, and it is an invaluable tool for grid operators and customers. Many methods have been proposed to provide reliable estimates of electric load demand, but few can address the problem of predicting energy demand from a probabilistic point of view. Gaussian processes (GPs) are one such method: with an adequate covariance function, they are suitable tools for this load forecasting task. In this article, we show how to use Gaussian processes to predict electrical energy demand. Additionally, we thoroughly test various covariance functions and provide a new one. The performance of the proposed methodology was tested on two real data sets, showing that GPs are competitive alternatives for short-term load demand forecasting compared to other state-of-the-art methods.

    Active learning and search on low-rank matrices

    Collaborative prediction is a powerful technique, useful in domains from recommender systems to guiding the scientific discovery process. Low-rank matrix factorization is one of the most powerful tools for collaborative prediction. This work presents a general approach for active collaborative prediction with the Probabilistic Matrix Factorization model. Using variational approximations or Markov chain Monte Carlo sampling to estimate the posterior distribution over models, we can choose query points to maximize our understanding of the model, to best predict unknown elements of the data matrix, or to find as many “positive” data points as possible. We evaluate our methods on simulated data, and also show their applicability to movie ratings prediction and the discovery of drug-target interactions.