CLEF 2017 NewsREEL Overview: Offline and Online Evaluation of Stream-based News Recommender Systems
The CLEF NewsREEL challenge allows researchers to evaluate news recommendation algorithms both online (NewsREEL Live) and offline (NewsREEL Replay). Compared with the previous year, NewsREEL challenged participants with a higher volume of messages and new news portals. In the 2017 edition of the CLEF NewsREEL challenge, a wide variety of new approaches has been implemented, ranging from the use of existing machine learning frameworks to ensemble methods and deep neural networks. This paper gives an overview of the implemented approaches and discusses the evaluation results. In addition, the main results of the Living Lab and the Replay task are explained.
CLEF NewsREEL 2016: Comparing Multi-Dimensional Offline and Online Evaluation of News Recommender Systems
Running in its third year at CLEF, NewsREEL challenged participants to develop news recommendation algorithms and have them benchmarked in an online (Task 1) and an offline setting (Task 2), respectively. This paper provides an overview of the NewsREEL scenario, outlines this year's campaign, presents results of both tasks, and discusses the approaches of the participating teams. Moreover, it surveys ideas on living lab evaluation that were presented as part of a "New Ideas" track at the conference in Portugal. The presented results illustrate the potential of multi-dimensional evaluation of recommendation algorithms in a living lab and in a simulation-based evaluation setting.
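Neither NewsREEL abstract gives implementation details, but the offline (Replay) side amounts to re-running a recorded message stream against a candidate recommender. The following Python sketch illustrates one plausible reading of such a replay loop; the message fields, the `recommend`/`update` interface, and the CTR proxy are all assumptions for illustration, not the NewsREEL framework's actual API.

```python
def replay_evaluate(recommender, messages):
    """Replay recorded messages in timestamp order and measure how often
    the recommender would have suggested the article the user actually
    read (a rough offline proxy for click-through rate).

    `messages` is assumed to be an iterable of dicts with keys
    'timestamp', 'type' ('impression' or 'click'), 'user', 'portal',
    and 'item'; `recommender` is assumed to expose `recommend` and
    `update` methods. These names are hypothetical.
    """
    hits = impressions = 0
    for msg in sorted(messages, key=lambda m: m["timestamp"]):
        if msg["type"] == "impression":
            impressions += 1
            if msg["item"] in recommender.recommend(msg["user"], msg["portal"]):
                hits += 1
        elif msg["type"] == "click":
            # Feedback reaches the model only after it has been used for
            # evaluation, mimicking the live (online) setting.
            recommender.update(msg["item"])
    return hits / impressions if impressions else 0.0
```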
Evaluation of recommender systems in streaming environments
Evaluation of recommender systems is typically done with finite datasets.
This means that conventional evaluation methodologies are only applicable in
offline experiments, where data and models are stationary. However, in real
world systems, user feedback is continuously generated, at unpredictable rates.
Given this setting, one important issue is how to evaluate algorithms in such a
streaming data environment. In this paper we propose a prequential evaluation
protocol for recommender systems, suitable for streaming data environments, but
also applicable in stationary settings. Using this protocol we are able to
monitor the evolution of algorithms' accuracy over time. Furthermore, we are
able to perform reliable comparative assessments of algorithms by computing
significance tests over a sliding window. We argue that besides being suitable
for streaming data, prequential evaluation allows the detection of phenomena
that would otherwise remain unnoticed in the evaluation of both offline and
online recommender systems.
Comment: Workshop on 'Recommender Systems Evaluation: Dimensions and Design' (REDD 2014), held in conjunction with RecSys 2014, October 10, 2014, Silicon Valley, United States.
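The protocol described is prequential (test-then-learn): each incoming event is first used to evaluate the current model and only afterwards to update it, with accuracy tracked over a sliding window. The sketch below is one possible reading of that idea, assuming a hypothetical `predict`/`update` model interface; it is not the authors' implementation.

```python
from collections import deque

def prequential_evaluation(model, event_stream, window_size=1000):
    """Test-then-learn loop over a stream of (user, item) feedback events.

    For each event, score the current model first, record a hit/miss,
    and only then let the model learn from the event.  Yields the hit
    rate over the most recent `window_size` events, so accuracy can be
    monitored as it evolves over time.
    """
    window = deque(maxlen=window_size)
    for user, true_item in event_stream:
        # 1. Evaluate: would the current model have recommended this item?
        recommended = model.predict(user, n=10)
        window.append(1 if true_item in recommended else 0)
        # 2. Learn: the event is used for training only after evaluation.
        model.update(user, true_item)
        yield sum(window) / len(window)
```

Comparing two algorithms then amounts to running both over the same stream and applying a significance test to the paired hit/miss indicators inside each sliding window, as the abstract suggests.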
Reducing Offline Evaluation Bias in Recommendation Systems
Recommendation systems have been integrated into the majority of large online
systems. They tailor those systems to individual users by filtering and ranking
information according to user profiles. This adaptation process influences the
way users interact with the system and, as a consequence, increases the
difficulty of evaluating a recommendation algorithm with historical data (via
offline evaluation). This paper analyses this evaluation bias and proposes a
simple item weighting solution that reduces its impact. The efficiency of the
proposed solution is evaluated on real-world data extracted from the Viadeo professional social network.
Comment: 23rd Annual Belgian-Dutch Conference on Machine Learning (Benelearn 2014), Brussels, Belgium (2014).
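The abstract does not give the weighting formula; a plausible sketch of the general idea is to down-weight test items in proportion to how heavily the deployed recommender exposed them, so offline metrics are not dominated by items the system itself made popular. The inverse-exposure weighting below is an illustrative assumption, not the paper's actual estimator.

```python
def weighted_precision(test_cases, exposure_counts):
    """Offline precision in which each test item contributes with a weight
    inversely related to how often the production recommender exposed it.

    test_cases: iterable of (recommended_items, clicked_item) pairs.
    exposure_counts: dict mapping item -> number of times it was shown by
        the live recommender (the source of the evaluation bias).
    """
    num = den = 0.0
    for recommended_items, clicked_item in test_cases:
        weight = 1.0 / (1.0 + exposure_counts.get(clicked_item, 0))
        num += weight * (1 if clicked_item in recommended_items else 0)
        den += weight
    return num / den if den else 0.0
```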
Whole-Chain Recommendations
With the recent prevalence of Reinforcement Learning (RL), there has been tremendous interest in developing RL-based recommender systems. In practical
recommendation sessions, users will sequentially access multiple scenarios,
such as the entrance pages and the item detail pages, and each scenario has its
specific characteristics. However, the majority of existing RL-based
recommender systems focus on optimizing one strategy for all scenarios or
separately optimizing each strategy, which could lead to sub-optimal overall
performance. In this paper, we study the recommendation problem with multiple
(consecutive) scenarios, i.e., whole-chain recommendations. We propose a
multi-agent RL-based approach (DeepChain), which can capture the sequential
correlation among different scenarios and jointly optimize multiple
recommendation strategies. To be specific, all recommender agents (RAs) share
the same memory of users' historical behaviors, and they work collaboratively
to maximize the overall reward of a session. Note that optimizing multiple
recommendation strategies jointly faces two challenges in the existing
model-free RL model - (i) it requires huge amounts of user behavior data, and
(ii) the distribution of rewards (users' feedback) is extremely unbalanced. In
this paper, we introduce model-based RL techniques to reduce the training data
requirement and execute more accurate strategy updates. The experimental
results based on a real e-commerce platform demonstrate the effectiveness of
the proposed framework.
Comment: 29th ACM International Conference on Information and Knowledge Management.
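DeepChain's architecture and model-based components are not detailed in the abstract; the skeleton below only illustrates the structural idea of one recommender agent per scenario, all reading from a shared memory of user behaviors and trained toward a single session-level reward. Class and method names are hypothetical.

```python
class SharedMemory:
    """Session-level record of user behaviors, shared by all agents."""
    def __init__(self):
        self.history = []

    def append(self, event):
        self.history.append(event)

class ScenarioAgent:
    """One recommender agent (RA) responsible for a single scenario,
    e.g. the entrance page or the item detail page."""
    def __init__(self, scenario, memory):
        self.scenario = scenario
        self.memory = memory

    def act(self, user_state):
        # A real policy would condition on the scenario, the user state,
        # and the shared history; this placeholder returns no items.
        return []

    def learn(self, transition, session_reward):
        # All agents are updated toward the same session-level reward,
        # which is what couples the consecutive scenarios together.
        pass

def run_session(agents, memory, scenario_sequence, env):
    """Roll out one user session across consecutive scenarios."""
    session_reward = 0.0
    for scenario in scenario_sequence:
        items = agents[scenario].act(env.user_state())
        feedback = env.step(scenario, items)   # clicks, purchases, ...
        memory.append((scenario, items, feedback))
        session_reward += feedback.reward
    return session_reward
```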