
    Evaluating Recommender Systems Qualitatively: A Survey and Comparative Analysis

    Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Business Analytics. Recommender systems have improved users' online quality of life by helping them find interesting and valuable items within a large item set. Most recommender system validation research has focused on accuracy metrics, studying the differences between the predicted and actual user ratings. However, recent research has found accuracy to underperform when systems go live, mainly due to accuracy's inability to validate recommendation lists as a single entity, and has shifted to evaluating recommender systems using "beyond-accuracy" metrics, like novelty and diversity. In this dissertation, we summarize and organize the leading research regarding the definitions and objectives of the beyond-accuracy metrics. Such metrics include coverage, diversity, novelty, serendipity, unexpectedness, utility, and fairness. The behaviors and relationships of these metrics are analyzed using four different models, two concerning the items' characteristics (item-based) and two regarding the users' behaviors (user-based). Furthermore, a new metric is proposed that allows the comparison of different models considering their overall beyond-accuracy performance. Using this metric, a reranking approach is designed to improve the performance of a system, aiming to achieve better recommendations. The impact of the reranking technique on each metric and algorithm is studied, and the accuracy and non-accuracy performance of each system is compared. We found that, although the reranking technique can increase most beyond-accuracy metrics, the accuracy of the system worsens due to the negative correlation between these two dimensions. We also found that item-based models tend to achieve much lower values of coverage and diversity than user-based models.
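    As an illustration (not taken from the dissertation itself), the following minimal sketch shows two beyond-accuracy metrics under common definitions from the literature: catalog coverage (the fraction of the catalog recommended to at least one user) and intra-list diversity (the average pairwise cosine distance within one recommendation list). The item feature vectors and the toy data are assumptions for the example; exact definitions vary across papers.

```python
import numpy as np

def catalog_coverage(recommendation_lists, catalog_size):
    """Fraction of the catalog appearing in at least one user's list."""
    recommended = {item for rec in recommendation_lists for item in rec}
    return len(recommended) / catalog_size

def intra_list_diversity(rec, item_vectors):
    """Average pairwise cosine distance (1 - similarity) within one list."""
    vectors = np.array([item_vectors[i] for i in rec], dtype=float)
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sims = unit @ unit.T
    # mean over the off-diagonal (upper-triangle) item pairs
    return (1.0 - sims)[np.triu_indices(len(rec), k=1)].mean()

# toy example: 2 users, a 5-item catalog, 2-d item feature vectors
item_vectors = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0], 3: [0.5, 0.5]}
lists = [[0, 1, 2], [1, 3, 0]]
print(catalog_coverage(lists, catalog_size=5))  # 4 of 5 items -> 0.8
print(intra_list_diversity(lists[0], item_vectors))
```

    A reranking step of the kind the dissertation studies would then reorder a candidate list to raise scores such as these, typically at some cost in accuracy.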

    Evaluating Recommender Systems for Technology Enhanced Learning: A Quantitative Survey

    The increasing number of publications on recommender systems for Technology Enhanced Learning (TEL) evidences a growing interest in their development and deployment. In order to support learning, recommender systems for TEL need to consider specific requirements, which differ from the requirements for recommender systems in other domains like e-commerce. Consequently, these particular requirements motivate the incorporation of specific goals and methods in the evaluation process for TEL recommender systems. In this article, the diverse evaluation methods that have been applied to evaluate TEL recommender systems are investigated. A total of 235 articles are selected from major conferences, workshops, journals, and books where relevant work has been published between 2000 and 2014. These articles are quantitatively analysed and classified according to the following criteria: type of evaluation methodology, subject of evaluation, and effects measured by the evaluation. Results from the survey suggest that there is a growing awareness in the research community of the necessity for more elaborate evaluations. At the same time, there is still substantial potential for further improvements. This survey highlights trends and discusses strengths and shortcomings of the evaluation of TEL recommender systems thus far, thereby aiming to stimulate researchers to contemplate novel evaluation approaches.

    Evaluating Recommender Systems: Survey and Framework

    The comprehensive evaluation of the performance of a recommender system is a complex endeavor: many facets need to be considered in configuring an adequate and effective evaluation setting. Such facets include, for instance, defining the specific goals of the evaluation, choosing an evaluation method, underlying data, and suitable evaluation metrics. In this paper, we consolidate and systematically organize this dispersed knowledge on recommender systems evaluation. We introduce the “Framework for EValuating Recommender systems” (FEVR) that we derive from the discourse on recommender systems evaluation. In FEVR, we categorize the evaluation space of recommender systems evaluation. We postulate that the comprehensive evaluation of a recommender system frequently requires considering multiple facets and perspectives in the evaluation. The FEVR framework provides a structured foundation to adopt adequate evaluation configurations that encompass this required multi-facetedness and forms the basis for advancing the field. We outline and discuss the challenges of a comprehensive evaluation of recommender systems, and provide an outlook on what we need to embrace and do to move forward as a research community.

    A Distributed and Accountable Approach to Offline Recommender Systems Evaluation

    Different software tools have been developed with the purpose of performing offline evaluations of recommender systems. However, the results obtained with these tools may not be directly comparable because of subtle differences in the experimental protocols and metrics. Furthermore, it is difficult to analyze several algorithms under the same experimental conditions without disclosing their implementation details. For these reasons, we introduce RecLab, an open source software for evaluating recommender systems in a distributed fashion. By relying on consolidated web protocols, we created RESTful APIs for training and querying recommenders remotely. In this way, it is possible to easily integrate algorithms realized with different technologies into the same toolkit. In detail, the experimenter can perform an evaluation by simply visiting a web interface provided by RecLab. The framework will then interact with all the selected recommenders, and it will compute and display a comprehensive set of evaluation metrics. The results of all experiments are permanently stored and publicly available in order to support accountability and comparative analyses. Comment: REVEAL 2018 Workshop on Offline Evaluation for Recommender Systems

    Exploring Data Splitting Strategies for the Evaluation of Recommendation Models

    Effective methodologies for evaluating recommender systems are critical, so that different systems can be compared in a sound manner. A commonly overlooked aspect of evaluating recommender systems is the selection of the data splitting strategy. In this paper, we both show that there is no standard splitting strategy and that the selection of splitting strategy can have a strong impact on the ranking of recommender systems during evaluation. In particular, we perform experiments comparing three common data splitting strategies, examining their impact over seven state-of-the-art recommendation models on two datasets. Our results demonstrate that the splitting strategy employed is an important confounding variable that can markedly alter the ranking of recommender systems, making much of the currently published literature non-comparable, even when the same datasets and metrics are used.
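    To make the contrast concrete (this sketch is illustrative, not the paper's own protocol), here are two splitting strategies commonly discussed in this context: a random holdout, which ignores time, and a global temporal holdout, which trains on the past and tests on the future. Interactions are assumed to be (user, item, timestamp) tuples; the toy log is an assumption for the example.

```python
import random

def random_split(interactions, test_ratio=0.2, seed=42):
    """Random holdout: shuffle all interactions, ignoring time."""
    rng = random.Random(seed)
    shuffled = interactions[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]

def temporal_split(interactions, test_ratio=0.2):
    """Global temporal holdout: train on the past, test on the future."""
    ordered = sorted(interactions, key=lambda x: x[2])  # sort by timestamp
    cut = int(len(ordered) * (1 - test_ratio))
    return ordered[:cut], ordered[cut:]

# toy interaction log of (user, item, timestamp) tuples
log = [(1, 10, 1), (1, 11, 5), (2, 10, 2), (2, 12, 6), (3, 11, 3)]
train_t, test_t = temporal_split(log)
print(test_t)  # the single most recent interaction: [(2, 12, 6)]
```

    Because the two strategies place different interactions in the test set, a model ranking computed under one strategy need not hold under the other, which is exactly the confounding effect the paper measures.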

    Evaluating recommender systems from the user's perspective: survey of the state of the art

    A recommender system is a Web technology that proactively suggests items of interest to users based on their objective behavior or explicitly stated preferences. Evaluations of recommender systems (RS) have traditionally focused on the performance of algorithms. However, many researchers have recently started investigating system effectiveness and evaluation criteria from users' perspectives. In this paper, we survey the state of the art of user experience research in RS by examining how researchers have evaluated design methods that augment an RS's ability to help users find the information or product that they truly prefer, interact with the system with ease, and form trust with the RS through system transparency, control, and privacy-preserving mechanisms. Finally, we examine how these system design features influence users' adoption of the technology. We summarize existing work concerning three crucial interaction activities between the user and the system: the initial preference elicitation process, the preference refinement process, and the presentation of the system's recommendation results. Additionally, we cover recent evaluation frameworks that measure a recommender system's overall perceived qualities and how these qualities influence users' behavioral intentions. The key results are summarized in a set of design guidelines that can provide useful suggestions to scholars and practitioners concerning the design and development of effective recommender systems. The survey also lays the groundwork for researchers to pursue future topics that have not been covered by existing methods.