In-memory, distributed content-based recommender system

Audenaert, P.; De Pessemier, Toon; Dooms, Simon; Fostier, Jan; Martens, Luc

Authors: P. Audenaert, Toon De Pessemier, Simon Dooms, Jan Fostier, Luc Martens
Publication date: 1 January 2014
Publisher: Springer Science and Business Media LLC

Abstract

Burdened by their popularity, recommender systems increasingly take on larger datasets while they are expected to deliver high-quality results within reasonable time. To meet these ever-growing requirements, industrial recommender systems often turn to parallel hardware and distributed computing. While the MapReduce paradigm is generally accepted for massively parallel data processing, it often entails complex algorithm reorganization and suboptimal efficiency because mid-computation values are typically read from and written to hard disk. This work implements an in-memory, content-based recommendation algorithm and shows how it can be parallelized and efficiently distributed across many homogeneous machines in a distributed-memory environment. By focusing on data parallelism and carefully constructing the definition of work in the context of recommender systems, we are able to partition the complete calculation process into any number of independent and equally sized jobs. An empirically validated performance model is developed to predict parallel speedup and promises high efficiencies for realistic hardware configurations. For the MovieLens 10M dataset we note efficiency values up to 71% for a configuration of 200 computing nodes (eight cores per node).
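
The idea of partitioning the calculation into independent, equally sized jobs can be illustrated with a minimal Python sketch. It does not reproduce the authors' implementation or their performance model. It assumes, purely for illustration, that one unit of work is scoring a single (user, item) pair and that all units cost the same. The names `partition_work`, `unit_to_pair` and `parallel_efficiency` are hypothetical, and the efficiency helper uses only the standard textbook definition (speedup divided by the number of workers).

```python
def partition_work(total_units: int, num_jobs: int) -> list[tuple[int, int]]:
    """Split range(total_units) into num_jobs contiguous half-open ranges.

    Assumption: every work unit has the same cost, so ranges whose sizes
    differ by at most one unit give (nearly) equally sized jobs.
    """
    if num_jobs <= 0:
        raise ValueError("num_jobs must be positive")
    if total_units < 0:
        raise ValueError("total_units must be non-negative")
    base, extra = divmod(total_units, num_jobs)
    ranges = []
    start = 0
    for job in range(num_jobs):
        # The first `extra` jobs take one additional unit each.
        size = base + (1 if job < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def unit_to_pair(unit: int, num_items: int) -> tuple[int, int]:
    """Map a flat work-unit index back to a (user, item) index pair."""
    return divmod(unit, num_items)


def parallel_efficiency(speedup: float, num_workers: int) -> float:
    """Standard definition: efficiency = speedup / number of workers."""
    return speedup / num_workers


if __name__ == "__main__":
    num_users, num_items, num_jobs = 4, 3, 5
    jobs = partition_work(num_users * num_items, num_jobs)
    for job_id, (lo, hi) in enumerate(jobs):
        pairs = [unit_to_pair(u, num_items) for u in range(lo, hi)]
        print(job_id, pairs)
```

Because each job is just an index range, a worker needs no coordination with other workers to know which pairs it owns, which is the property that makes the jobs independent in this sketch.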


Available Versions

Ghent University Academic Bibliography

oai:archive.ugent.be:5682650
