3 research outputs found
R-UCB: a Contextual Bandit Algorithm for Risk-Aware Recommender Systems
Mobile Context-Aware Recommender Systems can be naturally modelled as an
exploration/exploitation (exr/exp) trade-off problem, in which the system must
choose between maximizing its expected reward using its current knowledge
(exploitation) and learning more about the unknown user's preferences to
improve that knowledge (exploration). This problem has been addressed by the
reinforcement learning community, but existing approaches do not consider the
risk level of the current user's situation: when the risk level is high, it
may be dangerous to recommend items the user does not desire in her current
situation. In this paper we introduce R-UCB, an algorithm that takes the risk
level of the user's situation into account to adaptively balance exr and exp.
A detailed analysis of the experimental results reveals several important
findings about the exr/exp behaviour.
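The abstract does not give R-UCB's actual formula. A minimal Python sketch of one plausible risk-modulated UCB rule, where the exploration bonus shrinks as the situation's risk level grows; the function names, parameters, and (1 - risk) scaling are illustrative assumptions, not the paper's algorithm:

```python
import math

def risk_modulated_ucb(mean, pulls, t, risk, alpha=2.0):
    """Hypothetical UCB score whose exploration bonus is scaled by
    (1 - risk), where risk in [0, 1] is the current situation's risk
    level. At risk = 1 the score reduces to pure exploitation."""
    bonus = math.sqrt(alpha * math.log(t) / pulls)
    return mean + (1.0 - risk) * bonus

def choose_item(stats, t, risk):
    # stats: {item: (empirical_mean, pull_count)}; never-pulled items
    # are always tried first.
    def score(item):
        mean, pulls = stats[item]
        if pulls == 0:
            return float("inf")
        return risk_modulated_ucb(mean, pulls, t, risk)
    return max(stats, key=score)
```

In a safe (low-risk) situation the bonus dominates and a rarely tried item can win; in a high-risk situation the rule falls back to the item with the best known empirical mean.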
Conditionally Risk-Averse Contextual Bandits
Contextual bandits with average-case statistical guarantees are inadequate in
risk-averse situations because they may accept degraded worst-case behaviour
in exchange for better average performance. Designing a risk-averse contextual
bandit is challenging because exploration is necessary but risk-aversion is
sensitive to the entire distribution of rewards; nonetheless we exhibit the
first risk-averse contextual bandit algorithm with an online regret guarantee.
We conduct experiments in diverse scenarios where worst-case outcomes should
be avoided, spanning dynamic pricing, inventory management, and self-tuning
software, including a production exascale data processing system.
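The abstract does not spell out which conditional risk measure is used. Conditional value-at-risk (CVaR) is a standard choice for objectives that are "sensitive to the entire distribution of rewards"; a minimal empirical estimator (illustrative, not the paper's algorithm) looks like:

```python
def empirical_cvar(losses, alpha=0.1):
    """Mean of the worst alpha-fraction of observed losses: the
    quantity a risk-averse learner tries to keep small, in contrast
    to the plain average an average-case bandit optimizes."""
    k = max(1, int(len(losses) * alpha))
    worst = sorted(losses, reverse=True)[:k]
    return sum(worst) / k
```

For a loss sample whose average looks fine, the CVaR can still be large, which is exactly the gap between average-case and worst-case guarantees the abstract points at.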
Learning Personalized Risk Preferences for Recommendation
The rapid growth of e-commerce has made people accustomed to shopping online.
Before making purchases on e-commerce websites, most consumers tend to rely on
rating scores and review information to make purchase decisions. With this
information, they can infer the quality of products to reduce the risk of
purchase. Specifically, items with high rating scores and good reviews tend to
be less risky, while items with low rating scores and bad reviews might be
risky to purchase. On the other hand, purchase behaviors are also influenced
by consumers' tolerance for risk, known as their risk attitudes. Economists
have studied risk attitudes for decades. These studies reveal that people are
not always fully rational when making decisions, and their risk attitudes may
vary across circumstances.
Most existing work on recommender systems does not consider users' risk
attitudes in modeling, which may lead to inappropriate recommendations. For
example, suggesting a risky item to a risk-averse person, or a conservative
item to a risk-seeking person, may degrade the user experience. In this
paper, we propose a novel risk-aware recommendation
framework that integrates machine learning and behavioral economics to uncover
the risk mechanism behind users' purchasing behaviors. Concretely, we first
develop statistical methods to estimate the risk distribution of each item and
then incorporate the Nobel-Prize-winning Prospect Theory into our model to
learn how users choose among probabilistic alternatives that involve risk,
where the probabilities of the outcomes are uncertain. Experiments on several
e-commerce
datasets demonstrate that our approach can achieve better performance than many
classical recommendation approaches, and further analyses verify the
advantages of risk-aware recommendation beyond accuracy.
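The abstract does not give the paper's parameterization. A minimal sketch of the Kahneman-Tversky value and probability-weighting functions that Prospect Theory is built on, using the commonly cited 1992 parameter estimates (the helper names and the utility aggregation below are illustrative assumptions):

```python
def value(x, alpha=0.88, beta=0.88, lam=2.25):
    """Prospect Theory value function: concave for gains, convex for
    losses, and steeper for losses (loss aversion, lam > 1), relative
    to a reference point at 0."""
    return x ** alpha if x >= 0 else -lam * ((-x) ** beta)

def weight(p, gamma=0.61):
    """Inverse-S probability weighting: small probabilities are
    overweighted, moderate and large ones underweighted."""
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)

def prospect_utility(outcomes):
    # outcomes: list of (payoff, probability) pairs describing a
    # risky alternative, e.g. a purchase with uncertain quality.
    return sum(weight(p) * value(x) for x, p in outcomes)
```

The loss-aversion coefficient lam > 1 is what makes a risky, low-rated item costly to a risk-averse user even when its expected payoff is acceptable.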