
    Fast Differentially Private Matrix Factorization

    Differentially private collaborative filtering is a challenging task, both in terms of accuracy and speed. We present a simple algorithm that is provably differentially private, while offering good performance, using a novel connection of differential privacy to Bayesian posterior sampling via Stochastic Gradient Langevin Dynamics. Due to its simplicity, the algorithm lends itself to efficient implementation. By careful systems design and by exploiting the power-law behavior of the data to maximize CPU cache bandwidth, we are able to generate 1024-dimensional models at a rate of 8.5 million recommendations per second on a single PC.
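    The key step described in the abstract is replacing plain SGD on the factor matrices with Stochastic Gradient Langevin Dynamics, whose injected Gaussian noise yields posterior samples and underpins the privacy guarantee. Below is a minimal sketch of one such noisy update for matrix factorization; the function name, step-size scaling, and regularization term are illustrative assumptions, not the authors' exact algorithm.

        import numpy as np

        def sgld_mf_step(U, V, triples, lr, reg, rng):
            """One SGLD pass over (user, item, rating) triples.

            U, V: latent factor matrices of shape (n_users, d) and (n_items, d).
            Gaussian noise with variance equal to the step size is what turns
            the SGD update into Langevin posterior sampling.  Minibatch
            gradient rescaling is omitted for brevity (assumption).
            """
            d = U.shape[1]
            for u, i, r in triples:
                err = r - U[u] @ V[i]
                grad_u = -err * V[i] + reg * U[u]
                grad_v = -err * U[u] + reg * V[i]
                U[u] += -0.5 * lr * grad_u + rng.normal(0.0, np.sqrt(lr), size=d)
                V[i] += -0.5 * lr * grad_v + rng.normal(0.0, np.sqrt(lr), size=d)
            return U, V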

    Large-Scale Distributed Bayesian Matrix Factorization using Stochastic Gradient MCMC

    Despite having various attractive qualities such as high prediction accuracy and the ability to quantify uncertainty and avoid over-fitting, Bayesian Matrix Factorization has not been widely adopted because of the prohibitive cost of inference. In this paper, we propose a scalable distributed Bayesian matrix factorization algorithm using stochastic gradient MCMC. Our algorithm, based on Distributed Stochastic Gradient Langevin Dynamics, can not only match the prediction accuracy of standard MCMC methods like Gibbs sampling, but at the same time is as fast and simple as stochastic gradient descent. In our experiments, we show that our algorithm can achieve the same level of prediction accuracy as Gibbs sampling, an order of magnitude faster. We also show that our method reduces the prediction error as fast as distributed stochastic gradient descent, achieving a 4.1% improvement in RMSE for the Netflix dataset and a 1.8% improvement for the Yahoo music dataset.
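    Because the sampler produces a stream of posterior draws rather than a single point estimate, predictions are formed by averaging over collected samples, which is also where the uncertainty quantification mentioned above comes from. A minimal sketch, assuming samples is a list of (U, V) factor pairs gathered after burn-in (the names are illustrative, not the paper's API):

        import numpy as np

        def posterior_predict(samples, user, item):
            """Bayesian rating prediction from SG-MCMC samples.

            Averages the per-sample dot products; the spread of those
            predictions gives a simple uncertainty estimate.
            """
            preds = np.array([U[user] @ V[item] for U, V in samples])
            return preds.mean(), preds.std()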

    Scalable and interpretable product recommendations via overlapping co-clustering

    We consider the problem of generating interpretable recommendations by identifying overlapping co-clusters of clients and products, based only on positive or implicit feedback. Our approach is applicable to very large datasets because it exhibits almost linear complexity in the input examples and the number of co-clusters. We show, both on real industrial data and on publicly available datasets, that the recommendation accuracy of our algorithm is competitive with that of state-of-the-art matrix factorization techniques. In addition, our technique has the advantage of offering recommendations that are textually and visually interpretable. Finally, we examine how to implement our technique efficiently on Graphics Processing Units (GPUs).
    Comment: In IEEE International Conference on Data Engineering (ICDE) 201
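    Once overlapping co-clusters of clients and products have been found, recommendations and their explanations follow from cluster membership: a client is shown unseen products from the co-clusters they belong to, with the co-cluster itself serving as the interpretable reason. A rough sketch of that read-out step under these assumptions (the names and the simple counting score are illustrative, not the paper's algorithm):

        def recommend_from_coclusters(user, coclusters, seen):
            """Rank unseen items by how many of the user's co-clusters contain them.

            coclusters: iterable of (user_set, item_set) pairs; a user may
            belong to several, which is what makes the clustering overlapping.
            """
            scores = {}
            for users, items in coclusters:
                if user in users:
                    for item in items:
                        if item not in seen:
                            scores[item] = scores.get(item, 0) + 1
            return sorted(scores, key=scores.get, reverse=True)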