R-UCB: a Contextual Bandit Algorithm for Risk-Aware Recommender Systems
Mobile Context-Aware Recommender Systems can be naturally modelled as an
exploration/exploitation trade-off (exr/exp) problem, where the system has to
choose between maximizing its expected rewards given its current knowledge
(exploitation) and learning more about the user's unknown preferences to
improve that knowledge (exploration). This problem has been addressed by the
reinforcement learning community, but existing approaches do not consider the
risk level of the current user's situation, where it may be dangerous to
recommend items the user may not desire in her current situation if the risk
level is high. In this paper we introduce an algorithm named R-UCB that
considers the risk level of the user's situation to adaptively balance between
exr and exp. A detailed analysis of the experimental results reveals several
important findings about the exr/exp behaviour.
- …
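
The abstract does not spell out how R-UCB turns the risk level into an exr/exp balance, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes a standard UCB1-style score and shrinks the exploration bonus as risk grows; the function names, the `risk` value in [0, 1], and the constant `c` are all hypothetical.

```python
import math


def risk_aware_ucb_scores(counts, mean_rewards, risk, c=2.0):
    """Score each item as mean reward plus a risk-damped exploration bonus.

    counts[i]       -- how many times item i has been recommended so far
    mean_rewards[i] -- average observed reward of item i
    risk            -- assumed risk level in [0, 1]; 1 means exploration is unsafe
    c               -- assumed exploration constant (UCB1 uses 2)
    """
    if not 0.0 <= risk <= 1.0:
        raise ValueError("risk must lie in [0, 1]")
    total = sum(counts)
    scores = []
    for n, mean in zip(counts, mean_rewards):
        if n == 0:
            # Untried item: force a trial unless the situation is maximally risky.
            bonus = math.inf if risk < 1.0 else 0.0
        else:
            # UCB1-style confidence width, scaled down linearly with risk.
            bonus = math.sqrt(c * math.log(total) / n) * (1.0 - risk)
        scores.append(mean + bonus)
    return scores


def choose_item(counts, mean_rewards, risk, c=2.0):
    """Return the index of the item with the highest risk-damped score."""
    scores = risk_aware_ucb_scores(counts, mean_rewards, risk, c)
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    counts = [10, 1]          # item 0 is well explored, item 1 barely tried
    means = [0.6, 0.5]
    print(choose_item(counts, means, risk=0.0))  # low risk: bonus can favour item 1
    print(choose_item(counts, means, risk=1.0))  # high risk: pure exploitation
```

With `risk=0.0` the sketch behaves like plain UCB1 and favours the rarely tried item; with `risk=1.0` the bonus vanishes and the choice reduces to the item with the best observed mean. The linear damping is one arbitrary design choice among many.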