Search CORE

419 research outputs found

Dynamic Learning of Sequential Choice Bandit Problem under Marketing Fatigue

Author: Cao Junyu
Sun Wei
Publication venue
Publication date: 19/03/2019
Field of study

Motivated by the observation that overexposure to unwanted marketing activities leads to customer dissatisfaction, we consider a setting where a platform offers a sequence of messages to its users and is penalized when users abandon the platform due to marketing fatigue. We propose a novel sequential choice model to capture multiple interactions taking place between the platform and its user: Upon receiving a message, a user decides on one of the three actions: accept the message, skip and receive the next message, or abandon the platform. Based on user feedback, the platform dynamically learns users' abandonment distribution and their valuations of messages to determine the length of the sequence and the order of the messages, while maximizing the cumulative payoff over a horizon of length T. We refer to this online learning task as the sequential choice bandit problem. For the offline combinatorial optimization problem, we show that an efficient polynomial-time algorithm exists. For the online problem, we propose an algorithm that balances exploration and exploitation, and characterize its regret bound. Lastly, we demonstrate how to extend the model with user contexts to incorporate personalization

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Carousel Personalization in Music Streaming Apps with Contextual Bandits

Author: Agarwal Alekh
Chu Wei
Garivier Aurélien
Jiang Ray
Katariya Sumeet
Komiyama Junpei
Kveton Branislav
Wang Zhiyang
Zhang Shuai
Zhou Li
Zoghi Masrour
Zong Shi
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 30/09/2020
Field of study

Media services providers, such as music streaming platforms, frequently leverage swipeable carousels to recommend personalized content to their users. However, selecting the most relevant items (albums, artists, playlists...) to display in these carousels is a challenging task, as items are numerous and as users have different preferences. In this paper, we model carousel personalization as a contextual multi-armed bandit problem with multiple plays, cascade-based updates and delayed batch feedback. We empirically show the effectiveness of our framework at capturing characteristics of real-world carousels by addressing a large-scale playlist recommendation task on a global music streaming mobile app. Along with this paper, we publicly release industrial data from our experiments, as well as an open-source environment to simulate comparable carousel personalization learning problems.Comment: 14th ACM Conference on Recommender Systems (RecSys 2020, Best Short Paper Candidate

arXiv.org e-Print Archive

Crossref