Benchmarking News Recommendations in a Living Lab
Most user-centric studies of information access systems in the literature suffer from unrealistic settings or a limited number of participating users. To address this issue, the idea of a living lab has been promoted. Living labs allow us to evaluate research hypotheses with a large number of users who satisfy their information needs in a real context. In this paper, we introduce a living lab on news recommendation in real time. The living lab was first organized as the News Recommendation Challenge at ACM RecSys'13 and then as the campaign-style evaluation lab NEWSREEL at CLEF'14. Within this lab, researchers were asked to provide news article recommendations to millions of users in real time. Unlike participants in laboratory user studies, these users follow their own agenda; consequently, laboratory bias on their behavior can be neglected. We outline the living lab scenario and the experimental setup of the two benchmarking events, and we argue that this living lab can serve as a reference point for implementing living labs for the evaluation of information access systems.
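The lab's core contract is that the platform forwards live page impressions and each participating system must answer with recommendations within a strict time budget. As a rough sketch only (the field names and the recency-based baseline below are hypothetical illustrations, not the actual NEWSREEL/ORP message format), a participant's endpoint might look like this:

```python
# Minimal sketch of a living-lab style recommendation endpoint.
# The real NEWSREEL protocol is not reproduced here; "item_id" and the
# recency baseline are our own illustrative choices.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

RECENT_ITEMS = []  # most recently seen article ids (toy popularity/recency baseline)

class RecommendHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = json.loads(self.rfile.read(int(self.headers["Content-Length"])))
        # Record the impression, then answer fast: NEWSREEL required
        # responses in well under a second.
        item = body.get("item_id")           # hypothetical field name
        if item is not None:
            RECENT_ITEMS.append(item)
            del RECENT_ITEMS[:-100]          # keep a bounded recency window
        recs = list(dict.fromkeys(reversed(RECENT_ITEMS)))[:6]
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(json.dumps({"recs": recs}).encode())

if __name__ == "__main__":
    HTTPServer(("", 8080), RecommendHandler).serve_forever()
```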
Off-line vs. On-line Evaluation of Recommender Systems in Small E-commerce
In this paper, we present our work towards comparing on-line and off-line evaluation metrics in the context of small e-commerce recommender systems. Recommending for small e-commerce enterprises is rather challenging due to the lower volume of interactions and low user loyalty, which rarely extends beyond a single session. On the other hand, we usually deal with lower volumes of objects, which are easier for users to discover through various browsing/searching GUIs.

The main goal of this paper is to determine the applicability of off-line evaluation metrics for estimating the true usability of recommender systems (evaluated on-line in A/B testing). In total, 800 variants of recommending algorithms were evaluated off-line w.r.t. 18 metrics covering rating-based, ranking-based, novelty, and diversity evaluation. The off-line results were afterwards compared with the on-line evaluation of 12 selected recommender variants, and based on the results we tried to learn and apply a model predicting on-line results from off-line ones.

The off-line results showed great variance in performance across the different metrics, with the Pareto front covering 68% of the approaches. Furthermore, we observed that on-line results are considerably affected by user novelty. On-line metrics correlate positively with ranking-based metrics (AUC, MRR, nDCG) for novice users, while excessively high diversity and novelty had a negative impact on their on-line results. For users with more visited items, however, diversity became more important, while the relevance of ranking-based metrics gradually decreased.

Comment: Submitted to ACM Hypertext 2020 Conference.
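For concreteness, the ranking-based metrics mentioned above (AUC, MRR, nDCG) can all be computed from a single ranked list and a held-out set of relevant items. A minimal sketch, assuming binary relevance (our illustration, not the paper's code):

```python
# Illustrative computation of the ranking-based off-line metrics named
# above, for a single user with binary relevance judgments.
import math

def ranking_metrics(ranked_items, relevant):
    """ranked_items: ids ordered by predicted score; relevant: held-out hits."""
    hits = [1 if item in relevant else 0 for item in ranked_items]
    n_pos = sum(hits)
    n_neg = len(hits) - n_pos
    # AUC: fraction of (relevant, non-relevant) pairs ordered correctly.
    seen_neg, correct = 0, 0
    for h in reversed(hits):                 # walk up from the bottom rank
        if h:
            correct += seen_neg              # negatives ranked below this positive
        else:
            seen_neg += 1
    auc = correct / (n_pos * n_neg) if n_pos and n_neg else float("nan")
    # MRR: reciprocal rank of the first relevant item.
    mrr = next((1.0 / (i + 1) for i, h in enumerate(hits) if h), 0.0)
    # nDCG with binary gains.
    dcg = sum(h / math.log2(i + 2) for i, h in enumerate(hits))
    idcg = sum(1.0 / math.log2(i + 2) for i in range(n_pos))
    ndcg = dcg / idcg if idcg else 0.0
    return auc, mrr, ndcg

print(ranking_metrics(["a", "b", "c", "d"], relevant={"b", "d"}))
# -> (0.25, 0.5, ~0.65)
```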
Ontology-Based Recommendation of Editorial Products
Major academic publishers need to be able to analyse their vast catalogue of products and select the best items to be marketed at scientific venues. This is a complex exercise that requires characterising the topics of thousands of books with high precision and matching them with the interests of the relevant communities. At Springer Nature, this task has traditionally been handled manually by publishing editors. However, the rapid growth in the number of scientific publications and the dynamic nature of the Computer Science landscape have made this solution increasingly inefficient. We have addressed this issue by creating Smart Book Recommender (SBR), an ontology-based recommender system developed by The Open University (OU) in collaboration with Springer Nature, which supports their Computer Science editorial team in selecting the products to market at specific venues. SBR recommends books, journals, and conference proceedings relevant to a conference by taking advantage of a semantically enhanced representation of about 27K editorial products. This is based on the Computer Science Ontology, a very large-scale, automatically generated taxonomy of research areas. SBR also allows users to investigate why a certain publication was suggested by the system. It does so by means of an interactive graph view that displays the topic taxonomy of the recommended editorial product and compares it with the topic-centric characterization of the input conference. An evaluation carried out with seven Springer Nature editors and seven OU researchers has confirmed the effectiveness of the solution.
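While SBR's exact matching pipeline is not described here, the core idea of comparing ontology-derived topic characterizations can be sketched with a toy taxonomy. In the hedged example below, the `broader` map, the ancestor expansion, and the plain Jaccard overlap are our own simplifications, not the system's actual representation or weighting:

```python
# Sketch of topic-based matching: score an editorial product against a
# conference by comparing their taxonomy-expanded topic sets.

def expand(topics, broader):
    """Add all ancestor topics using a {topic: parent} taxonomy map."""
    closed = set()
    for t in topics:
        while t is not None and t not in closed:
            closed.add(t)
            t = broader.get(t)
    return closed

def score(product_topics, conf_topics, broader):
    a = expand(product_topics, broader)
    b = expand(conf_topics, broader)
    return len(a & b) / len(a | b) if a | b else 0.0

# Toy taxonomy fragment (illustrative, not the Computer Science Ontology).
broader = {
    "neural networks": "machine learning",
    "machine learning": "artificial intelligence",
}
print(score({"neural networks"}, {"machine learning"}, broader))  # 0.666...
```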
Recommender systems fairness evaluation via generalized cross entropy
Fairness in recommender systems has been considered with respect to sensitive attributes of users (e.g., gender, race) or items (e.g., revenue in a multistakeholder setting). Regardless, the concept has commonly been interpreted as some form of equality, i.e., the degree to which the system is meeting the information needs of all its users in an equal sense. In this paper, we argue that fairness in recommender systems does not necessarily imply equality; instead, it should consider a distribution of resources based on merits and needs. We present a probabilistic framework based on generalized cross entropy to evaluate the fairness of recommender systems under this perspective, where we show that the proposed framework is flexible and explanatory, allowing one to incorporate domain knowledge (through an ideal fair distribution) that can help to understand which item or user aspects a recommendation algorithm is over- or under-representing. Results on two real-world datasets show the merits of the proposed evaluation framework in terms of both user and item fairness.
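As a rough illustration of the idea (ours; the paper's exact estimator and parameter choices may differ), one standard generalized cross entropy between an ideal fair distribution and the system's empirical exposure distribution is zero exactly when the two coincide and negative otherwise, so deviations from the ideal in either direction push the score away from zero:

```python
# Sketch of fairness evaluation via generalized cross entropy between an
# ideal ("fair") distribution p_fair over user/item groups and the
# empirical distribution p_model of recommendation exposure. The
# alpha-parameterized form below is one common generalization; the
# paper's exact estimator may differ.
import numpy as np

def gce(p_fair, p_model, alpha=-1.0):
    p_fair = np.asarray(p_fair, dtype=float)
    p_model = np.asarray(p_model, dtype=float)
    return (np.sum(p_fair**alpha * p_model**(1 - alpha)) - 1) / (alpha * (1 - alpha))

# GCE is 0 only when the system matches the ideal distribution exactly,
# and negative otherwise.
uniform = [0.25, 0.25, 0.25, 0.25]           # e.g. equal exposure across 4 groups
print(gce(uniform, uniform))                  # -> 0.0
print(gce(uniform, [0.7, 0.1, 0.1, 0.1]))     # -> -0.54: group 1 over-represented
```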