Carousel Personalization in Music Streaming Apps with Contextual Bandits
Media services providers, such as music streaming platforms, frequently
leverage swipeable carousels to recommend personalized content to their users.
However, selecting the most relevant items (albums, artists, playlists...) to
display in these carousels is a challenging task, as items are numerous and as
users have different preferences. In this paper, we model carousel
personalization as a contextual multi-armed bandit problem with multiple plays,
cascade-based updates and delayed batch feedback. We empirically show the
effectiveness of our framework at capturing characteristics of real-world
carousels by addressing a large-scale playlist recommendation task on a global
music streaming mobile app. Along with this paper, we publicly release
industrial data from our experiments, as well as an open-source environment to
simulate comparable carousel personalization learning problems.
Comment: 14th ACM Conference on Recommender Systems (RecSys 2020), Best Short Paper Candidate.
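The three ingredients named in the abstract (multiple plays, cascade-based updates, delayed batch feedback) can be illustrated with a minimal sketch. This is not the paper's released environment or code; the Beta-posterior Thompson sampler, the `L_init` visible-slot convention, and all variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, K, L_init = 20, 6, 3   # candidates, carousel slots, slots visible without swiping

# Beta posteriors on each item's display-to-click probability (assumed model)
alpha = np.ones(n_items)
beta_ = np.ones(n_items)

def select_carousel():
    # Multiple plays via Thompson sampling: sample each item's rate,
    # fill the K carousel slots with the K highest samples
    samples = rng.beta(alpha, beta_)
    return np.argsort(samples)[::-1][:K]

def cascade_feedback(shown, clicked_positions, batch):
    # Cascade assumption: slots up to the last click (or only the
    # initially visible L_init slots if nothing was clicked) were seen
    seen_until = max(clicked_positions) + 1 if clicked_positions else L_init
    for pos in range(seen_until):
        batch.append((shown[pos], 1 if pos in clicked_positions else 0))

def apply_batch(batch):
    # Delayed batch feedback: posteriors refresh once per batch, not per event
    for item, reward in batch:
        alpha[item] += reward
        beta_[item] += 1 - reward
    batch.clear()
```

In use, many carousels would be displayed between calls to `apply_batch`, mimicking the delay with which streaming logs reach the learner.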
Cascading Bandits for Large-Scale Recommendation Problems
Most recommender systems recommend a list of items. The user examines the list, from the first item to the last, and often chooses the first attractive item and does not examine the rest. This type of user behavior can be modeled by the cascade model. In this work, we study cascading bandits, an online learning variant of the cascade model where the goal is to recommend the K most attractive items from a large set of L candidate items. We propose two algorithms for solving this problem, which are based on the idea of linear generalization. The key idea in our solutions is that we learn a predictor of the attraction probabilities of items from their features, as opposed to learning the attraction probability of each item independently as in the existing work. This results in practical learning algorithms whose regret does not depend on the number of items L. We bound the regret of one algorithm and comprehensively evaluate the other on a range of recommendation problems. The algorithm performs well and outperforms all baselines.
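The linear-generalization idea above can be sketched as a LinUCB-style learner with cascade feedback. This is a toy sketch, not the paper's algorithms: the exploration constant `c`, the random item features, and the update rule are all assumptions made for illustration.

```python
import numpy as np

d, L, K = 5, 100, 4       # feature dim, candidate items, list length
A = np.eye(d)             # ridge-regularised Gram matrix (shared across items)
b = np.zeros(d)
c = 0.5                   # exploration width (a tunable constant here)
rng = np.random.default_rng(1)
X = rng.normal(size=(L, d)) / np.sqrt(d)   # toy item features

def recommend():
    # Linear generalization: one shared weight vector predicts every
    # item's attraction probability, so regret need not grow with L
    A_inv = np.linalg.inv(A)
    theta = A_inv @ b
    ucb = X @ theta + c * np.sqrt(np.einsum('ld,dk,lk->l', X, A_inv, X))
    return np.argsort(ucb)[::-1][:K]

def update(ranked, click_pos):
    # Cascade model: the user examines items top-down and stops at the
    # first attractive one; items after the click are unobserved
    global A, b
    observed = ranked if click_pos is None else ranked[:click_pos + 1]
    for pos, item in enumerate(observed):
        r = 1.0 if click_pos is not None and pos == click_pos else 0.0
        A += np.outer(X[item], X[item])
        b += r * X[item]
```

Because only the shared d-dimensional statistics `(A, b)` are updated, the per-round cost and the learner's uncertainty shrink with observations rather than with the catalogue size L.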
Optimising Human-AI Collaboration by Learning Convincing Explanations
Machine learning models are being increasingly deployed to take, or assist in
taking, complicated and high-impact decisions, from quasi-autonomous vehicles
to clinical decision support systems. This poses challenges, particularly when
models have hard-to-detect failure modes and are able to take actions without
oversight. In order to handle this challenge, we propose a method for a
collaborative system that remains safe by having a human ultimately making
decisions, while giving the model the best opportunity to convince and debate
them with interpretable explanations. However, the most helpful explanation
varies among individuals and may be inconsistent across stated preferences. To
this end, we develop an algorithm, Ardent, that efficiently learns a ranking
through interaction and best assists humans in completing a task. By utilising a
collaborative approach, we can ensure safety and improve performance while
addressing transparency and accountability concerns. Ardent enables efficient
and effective decision-making by adapting to individual preferences for
explanations, which we validate through extensive simulations alongside a user
study involving a challenging image classification task, demonstrating
consistent improvement over competing systems.
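The general idea of adapting explanations to an individual through interaction can be sketched as a simple bandit over explanation styles. To be clear, this is not the Ardent algorithm (whose details are not given in the abstract); the style names, the Beta-Bernoulli model of "the user was convinced", and the Thompson-sampling choice are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
styles = ["saliency_map", "counterfactual", "prototype", "feature_attribution"]

# Beta posterior per style on "this explanation convinced the user" (assumed model)
succ = np.ones(len(styles))
fail = np.ones(len(styles))

def pick_style():
    # Thompson sampling: explore styles early, then converge to the one
    # this particular user finds most convincing
    return int(np.argmax(rng.beta(succ, fail)))

def record(style_idx, convinced):
    # Feedback from the interaction: did the explanation sway the human?
    succ[style_idx] += convinced
    fail[style_idx] += 1 - convinced
```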
Online Clustering of Bandits with Misspecified User Models
The contextual linear bandit is an important online learning problem where
given arm features, a learning agent selects an arm at each round to maximize
the cumulative rewards in the long run. A line of works, called the clustering
of bandits (CB), utilize the collaborative effect over user preferences and
have shown significant improvements over classic linear bandit algorithms.
However, existing CB algorithms require well-specified linear user models and
can fail when this critical assumption does not hold. Whether robust CB
algorithms can be designed for more practical scenarios with misspecified user
models remains an open problem. In this paper, we are the first to present the
important problem of clustering of bandits with misspecified user models
(CBMUM), where the expected rewards in user models can be perturbed away from
perfect linear models. We devise two robust CB algorithms, RCLUMB and RSCLUMB
(representing the learned clustering structure with dynamic graph and sets,
respectively), that can accommodate the inaccurate user preference estimations
and erroneous clustering caused by model misspecifications. We prove regret
upper bounds for our
algorithms under milder assumptions than previous CB works (notably, we move
past a restrictive technical assumption on the distribution of the arms), which
match the lower bound asymptotically up to logarithmic factors, and also
match the state-of-the-art results in several degenerate cases. The techniques
in proving the regret caused by misclustering users are quite general and may
be of independent interest. Experiments on both synthetic and real-world data
show that our algorithms outperform previous ones.
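The core robustness idea, keeping two users in the same cluster only when their estimates agree up to both statistical uncertainty and an allowed misspecification level, can be sketched as follows. This is not RCLUMB or RSCLUMB themselves; the confidence-width formula, the slack parameter `eps`, and the aggregation rule are illustrative assumptions.

```python
import numpy as np

d = 4
users = range(6)
A = {u: np.eye(d) for u in users}   # per-user ridge Gram matrices
b = {u: np.zeros(d) for u in users}
T = {u: 1 for u in users}           # per-user interaction counts
eps = 0.1                           # assumed misspecification level

def theta(u):
    # Per-user ridge estimate of the preference vector
    return np.linalg.solve(A[u], b[u])

def same_cluster(u, v):
    # Keep an edge only if the users' estimates are close once both a
    # confidence radius and a misspecification slack are allowed for
    width = lambda w: np.sqrt((1 + np.log(1 + T[w])) / (1 + T[w]))
    return np.linalg.norm(theta(u) - theta(v)) <= width(u) + width(v) + 2 * eps

def cluster_stats(u):
    # Aggregate statistics over the users still connected to u, so the
    # cluster estimate pools data without trusting misclustered users
    members = [v for v in users if v == u or same_cluster(u, v)]
    A_c = sum(A[v] for v in members) - (len(members) - 1) * np.eye(d)
    b_c = sum(b[v] for v in members)
    return A_c, b_c
```

The extra `2 * eps` slack is what distinguishes this from clean-model clustering of bandits: without it, bounded model error could permanently split users who in fact share preferences.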
Interactive social recommendation
National Research Foundation (NRF) Singapore under its International Research Centres in Singapore Funding Initiative.