6 research outputs found

    Statistical Significance of the Netflix Challenge

    Full text link
    Inspired by the legacy of the Netflix contest, we provide an overview of what has been learned---from our own efforts, and those of others---concerning the problems of collaborative filtering and recommender systems. The data set consists of about 100 million movie ratings (from 1 to 5 stars) involving some 480 thousand users and some 18 thousand movies; the associated ratings matrix is about 99% sparse. The goal is to predict ratings that users will give to movies; systems which can do this accurately have significant commercial applications, particularly on the world wide web. We discuss, in some detail, approaches to "baseline" modeling, singular value decomposition (SVD), as well as kNN (nearest neighbor) and neural network models; temporal effects, cross-validation issues, ensemble methods and other considerations are discussed as well. We compare existing models in a search for new models, and also discuss the mission-critical issues of penalization and parameter shrinkage which arise when the dimensions of a parameter space reaches into the millions. Although much work on such problems has been carried out by the computer science and machine learning communities, our goal here is to address a statistical audience, and to provide a primarily statistical treatment of the lessons that have been learned from this remarkable set of data.Comment: Published in at http://dx.doi.org/10.1214/11-STS368 the Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org

    A Distributed, Architecture-Centric Approach to Computing Accurate Recommendations from Very Large and Sparse Datasets

    Get PDF
    The use of recommender systems is an emerging trend today, when user behavior information is abundant. There are many large datasets available for analysis because many businesses are interested in future user opinions. Sophisticated algorithms that predict such opinions can simplify decision-making, improve customer satisfaction, and increase sales. However, modern datasets contain millions of records, which represent only a small fraction of all possible data. Furthermore, much of the information in such sparse datasets may be considered irrelevant for making individual recommendations. As a result, there is a demand for a way to make personalized suggestions from large amounts of noisy data. Current recommender systems are usually all-in-one applications that provide one type of recommendation. Their inflexible architectures prevent detailed examination of recommendation accuracy and its causes. We introduce a novel architecture model that supports scalable, distributed suggestions from multiple independent nodes. Our model consists of two components, the input matrix generation algorithm and multiple platform-independent combination algorithms. A dedicated input generation component provides the necessary data for combination algorithms, reduces their size, and eliminates redundant data processing. Likewise, simple combination algorithms can produce recommendations from the same input, so we can more easily distinguish between the benefits of a particular combination algorithm and the quality of the data it receives. Such flexible architecture is more conducive for a comprehensive examination of our system. We believe that a user's future opinion may be inferred from a small amount of data, provided that this data is most relevant. We propose a novel algorithm that generates a more optimal recommender input. Unlike existing approaches, our method sorts the relevant data twice. Doing this is slower, but the quality of the resulting input is considerably better. Furthermore, the modular nature of our approach may improve its performance, especially in the cloud computing context. We implement and validate our proposed model via mathematical modeling, by appealing to statistical theories, and through extensive experiments, data analysis, and empirical studies. Our empirical study examines the effectiveness of accuracy improvement techniques for collaborative filtering recommender systems. We evaluate our proposed architecture model on the Netflix dataset, a popular (over 130,000 solutions), large (over 100,000,000 records), and extremely sparse (1.1\%) collection of movie ratings. The results show that combination algorithm tuning has little effect on recommendation accuracy. However, all algorithms produce better results when supplied with a more relevant input. Our input generation algorithm is the reason for a considerable accuracy improvement

    Evaluating collaborative filtering over time

    Get PDF
    Recommender systems have become essential tools for users to navigate the plethora of content in the online world. Collaborative filtering—a broad term referring to the use of a variety, or combination, of machine learning algorithms operating on user ratings—lies at the heart of recommender systems’ success. These algorithms have been traditionally studied from the point of view of how well they can predict users’ ratings and how precisely they rank content; state of the art approaches are continuously improved in these respects. However, a rift has grown between how filtering algorithms are investigated and how they will operate when deployed in real systems. Deployed systems will continuously be queried for personalised recommendations; in practice, this implies that system administrators will iteratively retrain their algorithms in order to include the latest ratings. Collaborative filtering research does not take this into account: algorithms are improved and compared to each other from a static viewpoint, while they will be ultimately deployed in a dynamic setting. Given this scenario, two new problems emerge: current filtering algorithms are neither (a) designed nor (b) evaluated as algorithms that must account for time. This thesis addresses the divergence between research and practice by examining how collaborative filtering algorithms behave over time. Our contributions include: 1. A fine grained analysis of temporal changes in rating data and user/item similarity graphs that clearly demonstrates how recommender system data is dynamic and constantly changing. 2. A novel methodology and time-based metrics for evaluating collaborative filtering over time, both in terms of accuracy and the diversity of top-N recommendations. 3. A set of hybrid algorithms that improve collaborative filtering in a range of different scenarios. These include temporal-switching algorithms that aim to promote either accuracy or diversity; parameter update methods to improve temporal accuracy; and re-ranking a subset of users’ recommendations in order to increase diversity. 4. A set of temporal monitors that secure collaborative filtering from a wide range of different temporal attacks by flagging anomalous rating patterns. We have implemented and extensively evaluated the above using large-scale sets of user ratings; we further discuss how this novel methodology provides insight into dimensions of recommender systems that were previously unexplored. We conclude that investigating collaborative filtering from a temporal perspective is not only more suitable to the context in which recommender systems are deployed, but also opens a number of future research opportunities

    Interview with Simon Funk

    No full text
    corecore