    Scalable Collaborative Filtering Recommendation Algorithms on Apache Spark

    Collaborative filtering based recommender systems use information about a user's preferences to make personalized predictions about content, such as topics, people, or products, that they might find relevant. As the volume of accessible information and active users on the Internet continues to grow, it becomes increasingly difficult to compute recommendations quickly and accurately over a large dataset. In this study, we introduce an algorithmic framework built on top of Apache Spark for parallel computation of the neighborhood-based collaborative filtering problem, which allows the algorithm to scale linearly with a growing number of users. We also investigate several variants of this technique, including user-based and item-based recommendation approaches, correlation-based and vector-based similarity calculations, and selective down-sampling of user interactions. Finally, we provide an experimental comparison of these techniques on the MovieLens dataset consisting of 10 million movie ratings.
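
As a rough illustration of the ideas named in the abstract above (user-based neighborhood prediction with either a vector-based or a correlation-based similarity), here is a minimal single-machine sketch. It is not the paper's Spark framework: the toy `ratings` data, the function names, and the neighborhood size `k` are all assumptions made for this example. In a distributed setting, the per-user similarity computations would presumably be the part that is parallelized.

```python
from math import sqrt

# Toy ratings: user -> {item: rating}. Illustrative data only (assumption).
ratings = {
    "u1": {"A": 5.0, "B": 3.0, "C": 4.0},
    "u2": {"A": 4.0, "B": 2.0},
    "u3": {"B": 5.0, "C": 1.0},
}


def cosine(a, b):
    """Vector-based similarity: cosine between two sparse rating vectors."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[k] * b[k] for k in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def pearson(a, b):
    """Correlation-based similarity computed over co-rated items only."""
    shared = sorted(set(a) & set(b))
    if len(shared) < 2:
        return 0.0
    mean_a = sum(a[k] for k in shared) / len(shared)
    mean_b = sum(b[k] for k in shared) / len(shared)
    num = sum((a[k] - mean_a) * (b[k] - mean_b) for k in shared)
    den = sqrt(sum((a[k] - mean_a) ** 2 for k in shared)) * sqrt(
        sum((b[k] - mean_b) ** 2 for k in shared)
    )
    return num / den if den else 0.0


def predict(user, item, sim=cosine, k=2):
    """Similarity-weighted mean of the k most similar users' ratings for item."""
    neighbours = [
        (sim(ratings[user], ratings[other]), ratings[other][item])
        for other in ratings
        if other != user and item in ratings[other]
    ]
    # Keep the k most similar neighbours that have positive similarity.
    top = [(s, r) for s, r in sorted(neighbours, reverse=True)[:k] if s > 0]
    weight = sum(s for s, _ in top)
    return sum(s * r for s, r in top) / weight if weight else None


if __name__ == "__main__":
    print(predict("u2", "C"))
```

Passing `sim=pearson` to `predict` swaps in the correlation-based measure without changing the rest of the sketch.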

    BFSMpR:A BFS Graph based Recommendation System using Map Reduce

    Nowadays, many associations, organizations, and analysts need to manage huge datasets (i.e., terabytes or even petabytes). A well-known framework for processing such large datasets effectively is Hadoop MapReduce. Many systems of current interest (e.g., the Web and social networks) represent these large datasets as graphs. A key element of graph-based recommendation systems is that they rely on neighbors' interests, taking minimum distance into account. Most recent recommendation systems use complex strategies to generate recommendations for each user. This paper introduces an alternative approach that generates suggestions for users from an unweighted graph, executed with an iterative Hadoop MapReduce approach.
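
To make the idea of distance-based suggestions on an unweighted graph more concrete, the following is a minimal in-memory sketch using breadth-first search. It is not the paper's algorithm or its MapReduce implementation: the `graph` and `likes` data, the `max_hops` cutoff, and the ranking rule (closest hop distance first) are assumptions chosen for illustration. An iterative MapReduce version would typically expand one BFS frontier level per job rather than use a queue.

```python
from collections import deque

# Hypothetical unweighted, undirected user graph and per-user liked items.
graph = {
    "ann": ["bob", "cat"],
    "bob": ["ann", "dan"],
    "cat": ["ann"],
    "dan": ["bob", "eve"],
    "eve": ["dan"],
}
likes = {
    "ann": {"x"},
    "bob": {"x", "y"},
    "cat": {"z"},
    "dan": {"w"},
    "eve": {"v"},
}


def bfs_distances(graph, source):
    """Return the minimum hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, []):
            if neighbour not in dist:  # first visit is the shortest path
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def recommend(graph, likes, user, max_hops=2):
    """Suggest items liked by users within max_hops, nearest users first."""
    dist = bfs_distances(graph, user)
    best = {}  # item -> smallest hop distance at which it was seen
    for other, d in dist.items():
        if other == user or d > max_hops:
            continue
        for item in likes.get(other, set()) - likes.get(user, set()):
            best[item] = min(d, best.get(item, d))
    # Sort by distance, then by name so the order is deterministic.
    return sorted(best, key=lambda item: (best[item], item))


if __name__ == "__main__":
    print(recommend(graph, likes, "ann"))
```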

    Product Recommendation using Hadoop

    Recommendation systems are widely used to provide personalized recommendations to users. E-commerce and social networking websites use such systems to increase their business and user engagement. The day-to-day growth of customers and products poses a challenge for generating high-quality recommendations. Moreover, such systems must produce many recommendations per second for millions of customers and products. In such scenarios, a sequential implementation of a recommendation algorithm suffers from serious performance problems. To address these problems, we propose a parallel algorithm that generates recommendations using the Hadoop MapReduce framework. In this implementation, we focus on the item-based collaborative filtering technique, a well-known technique for generating recommendations, applied to users' browsing history.
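
One common way to express item-based filtering over browsing history in a map/reduce style is to count item co-occurrences. The sketch below imitates the two phases in plain Python on one machine; it does not use Hadoop and is not the paper's algorithm. The `history` data and the function names `map_phase`, `reduce_phase`, and `recommend` are hypothetical, and raw co-occurrence counts are an assumed scoring rule.

```python
from collections import Counter
from itertools import permutations

# Hypothetical browsing history: user -> items viewed.
history = {
    "u1": ["book", "lamp", "pen"],
    "u2": ["book", "pen"],
    "u3": ["lamp", "pen"],
}


def map_phase(user_items):
    """Emit ((item_a, item_b), 1) for each ordered pair one user viewed."""
    for a, b in permutations(sorted(set(user_items)), 2):
        yield (a, b), 1


def reduce_phase(pairs):
    """Sum the emitted counts per item pair."""
    counts = Counter()
    for key, value in pairs:
        counts[key] += value
    return counts


def recommend(item, counts, top_n=2):
    """Return the items most often viewed together with the given item."""
    scored = [(n, other) for (a, other), n in counts.items() if a == item]
    # Highest count first; ties broken alphabetically for determinism.
    ranked = sorted(scored, key=lambda t: (-t[0], t[1]))
    return [other for _, other in ranked[:top_n]]


# Each user's history can be mapped independently, which is what makes
# this step easy to parallelize across machines.
emitted = (pair for items in history.values() for pair in map_phase(items))
cooccurrence = reduce_phase(emitted)

if __name__ == "__main__":
    print(recommend("book", cooccurrence))
```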

    A User- Based Recommendation with a Scalable Machine Learning Tool

    Recommender systems have proven to be a valuable way to recommend information items such as books, videos, and songs to online users. Collaborative filtering methods are used to make predictions from historical data. In this paper we introduce Apache Mahout, an open-source project that provides a rich set of components for constructing a customized recommender system from a selection of machine learning algorithms [12]. This paper also focuses on addressing challenges in collaborative filtering such as scalability and data sparsity. To deal with scalability problems, we use a distributed framework such as Hadoop. We then present a customized user-based recommender system.

    A Survey of Recommendation Systems and Performance Enhancing Methods

    With the development of web services such as e-commerce, job hunting websites, and movie websites, recommendation systems play an increasingly important role in helping users find items of potential interest amid information overload. A great deal of research is available in this field, which gives researchers many recommendation approaches to choose from when they implement their own recommendation systems. This paper gives a systematic literature review of recommendation systems, with sources extracted from Scopus. The summary of these papers focuses on the research problem addressed, the similarity metrics used, the proposed method, and the evaluation metrics used. In addition to the methodology used in traditional recommendation systems, the paper describes how several papers apply additional performance-enhancing methods such as machine learning methods, matrix factorization techniques, and big data tools. By reading this paper, researchers can understand what types of recommendation systems exist, what the general process of a recommendation system is, and how performance-enhancing methods can be used to improve a system's performance. They can then choose a recommendation system that interests them for either implementation or research purposes.

    In-memory, distributed content-based recommender system

    Burdened by their popularity, recommender systems increasingly take on larger datasets while they are expected to deliver high quality results within reasonable time. To meet these ever growing requirements, industrial recommender systems often turn to parallel hardware and distributed computing. While the MapReduce paradigm is generally accepted for massive parallel data processing, it often entails complex algorithm reorganization and suboptimal efficiency because mid-computation values are typically read from and written to hard disk. This work implements an in-memory, content-based recommendation algorithm and shows how it can be parallelized and efficiently distributed across many homogeneous machines in a distributed-memory environment. By focusing on data parallelism and carefully constructing the definition of work in the context of recommender systems, we are able to partition the complete calculation process into any number of independent and equally sized jobs. An empirically validated performance model is developed to predict parallel speedup and promises high efficiencies for realistic hardware configurations. For the MovieLens 10 M dataset we note efficiency values up to 71 % for a configuration of 200 computing nodes (eight cores per node)