Rank and relevance in novelty and diversity metrics for recommender systems
This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in RecSys '11 Proceedings of the fifth ACM conference on Recommender systems, http://dx.doi.org/10.1145/2043932.2043955
The Recommender Systems community is paying increasing attention to novelty and diversity as key qualities beyond accuracy in real recommendation scenarios. Despite the rise of interest and work on the topic in recent years, we find that a clear common methodological and conceptual ground for the evaluation of these dimensions has yet to be consolidated. Different evaluation metrics have been reported in the literature, but the precise relations, distinctions, or equivalences between them have not been explicitly studied. Furthermore, the metrics reported so far miss important properties, such as taking into consideration the ranking of recommended items, or whether items are relevant, when assessing the novelty and diversity of recommendations.
We present a formal framework for the definition of novelty and diversity metrics that unifies and generalizes several state-of-the-art metrics. We identify three essential ground concepts at the roots of novelty and diversity: choice, discovery and relevance, upon which the framework is built. Item rank and relevance are introduced through a probabilistic recommendation browsing model, building upon the same three basic concepts. Based on the combination of ground elements, and on the assumptions of the browsing model, different metrics and variants unfold. We report experimental observations which validate and illustrate the properties of the proposed metrics.
This work is supported by the Spanish Government (TIN2011-28538-C02-01) and the Government of Madrid (S2009TIC-1542).
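The rank- and relevance-aware weighting described in the abstract can be sketched as a discounted sum over the ranking. The geometric discount, the binary relevance, and all names below are illustrative assumptions for a minimal sketch, not the paper's exact definitions:

```python
def novelty_metric(ranking, relevant, item_novelty, disc_base=0.85):
    """Rank- and relevance-aware novelty score, roughly of the form
    m(R) = C * sum_k disc(k) * rel(i_k) * nov(i_k).

    ranking:      recommended items in rank order
    relevant:     set of items judged relevant for the user
    item_novelty: dict mapping item -> novelty score
                  (e.g. 1 - popularity); an assumed representation
    disc_base:    base of an assumed geometric rank discount
    """
    score, norm = 0.0, 0.0
    for k, item in enumerate(ranking):
        disc = disc_base ** k                    # browsing model: deeper ranks count less
        rel = 1.0 if item in relevant else 0.0   # relevance gates the contribution
        score += disc * rel * item_novelty[item]
        norm += disc                             # normalizing constant C
    return score / norm if norm else 0.0
```

Dropping the `rel` factor recovers a rank-only variant, and setting `disc_base=1.0` recovers a rank-agnostic average, which is how the different metric variants in such a framework unfold from the same ingredients.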
Off-line vs. On-line Evaluation of Recommender Systems in Small E-commerce
In this paper, we present our work on comparing on-line and off-line evaluation metrics in the context of small e-commerce recommender systems. Recommending for small e-commerce enterprises is rather challenging due to the lower volume of interactions and low user loyalty, which rarely extends beyond a single session. On the other hand, we usually have to deal with lower volumes of objects, which are easier for users to discover through various browsing/searching GUIs.
The main goal of this paper is to determine the applicability of off-line evaluation metrics in learning the true usability of recommender systems (evaluated on-line in A/B testing). In total, 800 variants of recommending algorithms were evaluated off-line w.r.t. 18 metrics covering rating-based, ranking-based, novelty, and diversity evaluation. The off-line results were afterwards compared with the on-line evaluation of 12 selected recommender variants, and based on the results, we tried to learn and utilize an off-line to on-line results prediction model.
Off-line results showed great variance in performance w.r.t. different metrics, with the Pareto front covering 68\% of the approaches. Furthermore, we observed that on-line results are considerably affected by the novelty of users. On-line metrics correlate positively with ranking-based metrics (AUC, MRR, nDCG) for novice users, while excessively high values of diversity and novelty had a negative impact on the on-line results for them. For users with more visited items, however, diversity became more important, while the relevance of ranking-based metrics gradually decreased.
Comment: Submitted to ACM Hypertext 2020 Conference
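The Pareto front used in such multi-metric comparisons, i.e. the set of algorithm variants not dominated on every off-line metric at once, can be computed with a short sketch. The higher-is-better convention and all names are assumptions for illustration:

```python
def pareto_front(points):
    """Return the indices of non-dominated points, where each point is a
    tuple of metric scores for one algorithm variant and higher is better
    in every dimension. A point is dominated if some other point is at
    least as good in all metrics and strictly better in at least one."""
    front = []
    for i, p in enumerate(points):
        dominated = any(
            all(q[d] >= p[d] for d in range(len(p)))
            and any(q[d] > p[d] for d in range(len(p)))
            for j, q in enumerate(points) if j != i
        )
        if not dominated:
            front.append(i)
    return front
```

With many weakly correlated metrics, most variants end up non-dominated, which is consistent with a front covering a large share of the evaluated approaches.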
A Distributed and Accountable Approach to Offline Recommender Systems Evaluation
Different software tools have been developed with the purpose of performing offline evaluations of recommender systems. However, the results obtained with these tools may not be directly comparable because of subtle differences in the experimental protocols and metrics. Furthermore, it is difficult to analyze several algorithms under the same experimental conditions without disclosing their implementation details. For these reasons, we introduce RecLab, an open source software for evaluating recommender systems in a distributed fashion. By relying on consolidated web protocols, we created RESTful APIs for training and querying recommenders remotely. In this way, it is possible to easily integrate algorithms realized with different technologies into the same toolkit. In detail, the experimenter can perform an evaluation by simply visiting a web interface provided by RecLab. The framework will then interact with all the selected recommenders, and it will compute and display a comprehensive set of measures, each representing a different metric. The results of all experiments are permanently stored and publicly available in order to support accountability and comparative analyses.
Comment: REVEAL 2018 Workshop on Offline Evaluation for Recommender Systems
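Training and querying a recommender over RESTful APIs, as described above, might look like the following minimal client sketch. The base URL, endpoint paths, and payload fields are illustrative assumptions, not RecLab's actual API:

```python
import json
import urllib.request

# Hypothetical service address; not a documented RecLab endpoint.
BASE_URL = "http://localhost:8080"

def build_request(path, payload):
    """Build (but do not send) a JSON POST request to the remote
    recommender service. Keeping construction separate from sending
    makes the client easy to test without a running server."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Example usage (requires a running service); paths are assumptions:
#   urllib.request.urlopen(build_request("/train", {"ratings": [[1, 42, 5.0]]}))
#   urllib.request.urlopen(build_request("/recommend", {"user": 1, "k": 10}))
```

Because the protocol is plain HTTP with JSON bodies, a recommender written in any technology only needs to expose matching endpoints to join the evaluation.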
Recommending Items in Social Tagging Systems Using Tag and Time Information
In this work, we present a novel item recommendation approach that aims at improving Collaborative Filtering (CF) in social tagging systems using information about tags and time. Our algorithm follows a two-step approach: in the first step, a potentially interesting candidate item-set is found using user-based CF, and in the second step, this candidate item-set is ranked using item-based CF. Within this ranking step, we integrate tag usage and time information via the Base-Level Learning (BLL) equation from human memory theory, which determines the reuse probability of words and tags using a power-law forgetting function.
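The BLL equation from the ACT-R theory of human memory gives an item's base-level activation as the log of a power-law-decayed sum over its past uses, so frequent and recent uses raise the reuse probability while old uses are gradually forgotten. A minimal sketch, where the timestamp representation is an illustrative assumption and d = 0.5 is the decay commonly used in the ACT-R literature:

```python
import math

def bll_activation(timestamps, now, d=0.5):
    """Base-Level Learning activation: ln( sum_j (now - t_j)^(-d) ),
    where t_j are the past times a tag (or word) was used and d is the
    power-law forgetting exponent. Larger activation means a higher
    reuse probability. All timestamps must lie strictly before `now`."""
    return math.log(sum((now - t) ** (-d) for t in timestamps))
```

In a tag-aware ranking step, such activations can weight the contribution of a candidate item's tags, favoring items described by tags the user has used frequently and recently.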
As the results of our extensive evaluation, conducted on data-sets gathered from three social tagging systems (BibSonomy, CiteULike and MovieLens), show, using tag-based and time information via the BLL equation also helps to improve the ranking and recommendation of items, and thus it can be used to realize an effective item recommender that outperforms two alternative algorithms which also exploit time and tag-based information.
Comment: 6 pages, 2 tables, 9 figures