Carousel Personalization in Music Streaming Apps with Contextual Bandits

Author: Agarwal Alekh
Chu Wei
Garivier Aurélien
Jiang Ray
Katariya Sumeet
Komiyama Junpei
Kveton Branislav
Wang Zhiyang
Zhang Shuai
Zhou Li
Zoghi Masrour
Zong Shi
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 30/09/2020

Media services providers, such as music streaming platforms, frequently leverage swipeable carousels to recommend personalized content to their users. However, selecting the most relevant items (albums, artists, playlists...) to display in these carousels is a challenging task, as items are numerous and as users have different preferences. In this paper, we model carousel personalization as a contextual multi-armed bandit problem with multiple plays, cascade-based updates and delayed batch feedback. We empirically show the effectiveness of our framework at capturing characteristics of real-world carousels by addressing a large-scale playlist recommendation task on a global music streaming mobile app. Along with this paper, we publicly release industrial data from our experiments, as well as an open-source environment to simulate comparable carousel personalization learning problems.

Comment: 14th ACM Conference on Recommender Systems (RecSys 2020), Best Short Paper Candidate

arXiv.org e-Print Archive

Crossref
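
The combination of multiple plays, cascade-based updates and delayed batch feedback named in the abstract above can be sketched roughly as follows. This is a minimal, non-contextual illustration using Thompson sampling with Beta posteriors, not the authors' released environment. The function names, the `n_init_visible` parameter, and the rule that a click at rank i implies every slot up to max(`n_init_visible`, i) was seen are assumptions made for the example.

```python
import random
from typing import Dict, List, Set, Tuple


def select_carousel(posteriors: Dict[str, List[int]], n_slots: int,
                    rng: random.Random) -> List[str]:
    """Multiple plays: sample one score per item and keep the top n_slots."""
    sampled = {item: rng.betavariate(a, b) for item, (a, b) in posteriors.items()}
    return sorted(sampled, key=sampled.get, reverse=True)[:n_slots]


def cascade_feedback(displayed: List[str], clicked: Set[str],
                     n_init_visible: int) -> List[Tuple[str, int]]:
    """Cascade assumption (hypothetical rule): slots up to the last click, or
    the initially visible slots if that is further, count as seen; later
    slots yield no feedback at all."""
    click_ranks = [i for i, item in enumerate(displayed) if item in clicked]
    last_seen = n_init_visible - 1
    if click_ranks:
        last_seen = max(last_seen, click_ranks[-1])
    last_seen = min(last_seen, len(displayed) - 1)
    return [(item, int(item in clicked)) for item in displayed[:last_seen + 1]]


def apply_batch(posteriors: Dict[str, List[int]],
                batch: List[Tuple[str, int]]) -> None:
    """Delayed batch feedback: fold a whole batch of (item, reward) pairs
    into the Beta(alpha, beta) counts in one pass."""
    for item, reward in batch:
        posteriors[item][0] += reward
        posteriors[item][1] += 1 - reward


if __name__ == "__main__":
    rng = random.Random(0)
    posteriors = {name: [1, 1] for name in ["p1", "p2", "p3", "p4", "p5"]}
    shown = select_carousel(posteriors, n_slots=4, rng=rng)
    # Pretend the user clicked the third displayed playlist.
    batch = cascade_feedback(shown, {shown[2]}, n_init_visible=2)
    apply_batch(posteriors, batch)
    print(shown, batch)
```

A contextual variant would keep one set of posteriors per user segment, or replace the Beta counts with a model over user features; the cascade rule itself would stay the same.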

SAMPLE-BASED DYNAMIC HIERARCHICAL TRANSFORMER WITH LAYER AND HEAD FLEXIBILITY VIA CONTEXTUAL BANDIT

Author: Fanfei Meng
Lele Zhang
Yu Chen
Yuxin Wang
Publication venue: University of Kragujevac
Publication date: 01/06/2024

Transformers require a fixed number of layers and heads, which makes them inflexible to the complexity of individual samples and expensive in training and inference. To address this, we propose a sample-based Dynamic Hierarchical Transformer (DHT) model whose layers and heads can be dynamically configured with single data samples via solving contextual bandit problems. To determine the number of layers and heads, we use the Uniform Confidence Bound algorithm, while we deploy combinatorial Thompson Sampling to select specific head combinations given their number. Unlike previous work that focuses on compressing trained networks for inference only, DHT is not only advantageous for adaptively optimizing the underlying network architecture during training but also has a flexible network for efficient inference. To the best of our knowledge, this is the first comprehensive data-driven dynamic transformer without any additional auxiliary neural networks that implement the dynamic system. According to the experimental results, we achieve up to 74% computational savings for both training and inference with a minimal loss of accuracy.

Directory of Open Access Journals
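
The idea of picking a layer count with a confidence-bound rule, as described in the DHT abstract above, can be illustrated with a small bandit loop. The sketch below uses the standard UCB1 index (upper confidence bound), which may differ from the variant the authors call Uniform Confidence Bound. The candidate layer counts and the fixed reward per layer count are hypothetical stand-ins for a real accuracy-versus-cost signal.

```python
import math
from typing import Dict, List


def ucb1_select(counts: List[int], reward_sums: List[float]) -> int:
    """Return the arm index with the highest UCB1 score.

    Arms that were never played are tried first so every mean is defined.
    """
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    total = sum(counts)
    scores = [
        reward_sums[arm] / counts[arm]
        + math.sqrt(2.0 * math.log(total) / counts[arm])
        for arm in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)


def simulate(layer_options: List[int], reward_of: Dict[int, float],
             rounds: int) -> List[int]:
    """Play UCB1 for a number of rounds and return how often each option ran.

    reward_of is a hypothetical fixed reward per layer count, standing in
    for a measured trade-off between accuracy and compute.
    """
    counts = [0] * len(layer_options)
    sums = [0.0] * len(layer_options)
    for _ in range(rounds):
        arm = ucb1_select(counts, sums)
        counts[arm] += 1
        sums[arm] += reward_of[layer_options[arm]]
    return counts


if __name__ == "__main__":
    options = [2, 4, 6]  # hypothetical candidate layer counts
    plays = simulate(options, {2: 0.2, 4: 0.8, 6: 0.5}, rounds=500)
    print(dict(zip(options, plays)))
```

In a per-sample setting the counts and sums would be conditioned on features of the input, which is what makes the problem contextual rather than a plain multi-armed bandit.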

Efficient Ordered Combinatorial Semi-Bandits for Whole-Page Recommendation

Author: Asamov Tsvetan
Chang Yi
Chen Jianhui
Ouyang Hua
Wang Chu
Wang Yingfei
Publication venue: Association for the Advancement of Artificial Intelligence
Publication date: 13/02/2017

The Multi-Armed Bandit (MAB) framework has been successfully applied in many web applications. However, many complex real-world applications that involve multiple content recommendations cannot fit into the traditional MAB setting. To address this issue, we consider an ordered combinatorial semi-bandit problem where the learner recommends S actions from a base set of K actions, and displays the results in S (out of M) different positions. The aim is to maximize the cumulative reward with respect to the best possible subset and positions in hindsight. By adapting a minimum-cost maximum-flow network, a practical algorithm based on Thompson sampling is derived for the (contextual) combinatorial problem, thus resolving the problem of computational intractability. The method can work with whole-page recommendation and any probabilistic model; to illustrate its effectiveness, we focus on Gaussian process optimization and a contextual setting where click-through rate is predicted using logistic regression. We demonstrate the algorithms’ performance on synthetic Gaussian process problems and on large-scale news article recommendation datasets from Yahoo! Front Page Today Module.

Association for the Advancement of Artificial Intelligence: AAAI Publications
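
The Thompson sampling step for the ordered combinatorial semi-bandit in the abstract above can be sketched on a toy instance. The sketch assumes Bernoulli clicks with one Beta posterior per (action, position) pair, fills exactly as many positions as requested, and replaces the paper's minimum-cost maximum-flow solver with a brute-force search over ordered subsets, which is only practical for very small K and S. All function names are hypothetical.

```python
import itertools
import random
from typing import List, Sequence, Tuple


def sample_theta(alpha: List[List[int]], beta: List[List[int]],
                 rng: random.Random) -> List[List[float]]:
    """Draw one click-probability sample per (action, position) pair."""
    return [
        [rng.betavariate(a, b) for a, b in zip(alpha_row, beta_row)]
        for alpha_row, beta_row in zip(alpha, beta)
    ]


def best_assignment(theta: List[List[float]],
                    n_positions: int) -> Tuple[int, ...]:
    """Brute-force stand-in for the flow-based solver: entry p of the result
    is the action placed at position p, chosen to maximize the summed
    sampled reward over all ordered subsets of actions."""
    n_actions = len(theta)
    return max(
        itertools.permutations(range(n_actions), n_positions),
        key=lambda perm: sum(theta[a][p] for p, a in enumerate(perm)),
    )


def update(alpha: List[List[int]], beta: List[List[int]],
           assignment: Sequence[int], clicks: Sequence[int]) -> None:
    """Semi-bandit feedback: each displayed position reports its own click."""
    for position, (action, click) in enumerate(zip(assignment, clicks)):
        alpha[action][position] += click
        beta[action][position] += 1 - click


if __name__ == "__main__":
    rng = random.Random(1)
    n_actions, n_positions = 4, 2
    alpha = [[1] * n_positions for _ in range(n_actions)]
    beta = [[1] * n_positions for _ in range(n_actions)]
    chosen = best_assignment(sample_theta(alpha, beta, rng), n_positions)
    update(alpha, beta, chosen, clicks=[1, 0])  # hypothetical user feedback
    print(chosen)
```

The brute-force search enumerates K!/(K-S)! ordered subsets, which is the intractability the flow formulation in the abstract is meant to avoid.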