Asymptotically Optimal Algorithms for Budgeted Multiple Play Bandits
We study a generalization of the multi-armed bandit problem with multiple
plays where there is a cost associated with pulling each arm and the agent has
a budget at each time that dictates how much she can expect to spend. We derive
an asymptotic regret lower bound for any uniformly efficient algorithm in our
setting. We then study a variant of Thompson sampling for Bernoulli rewards and
a variant of KL-UCB for both single-parameter exponential families and bounded,
finitely supported rewards. We show these algorithms are asymptotically
optimal, both in rate and leading problem-dependent constants, including in the
thick margin setting where multiple arms fall on the decision boundary.
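
For orientation, here is a minimal sketch of the standard single-play KL-UCB index for Bernoulli rewards, computed by bisection. It is not the budgeted multiple-play variant studied in the abstract above, which also accounts for arm costs and the per-round budget. The function names, the plain log(t) exploration threshold, and the tolerance are assumptions made for illustration.

```python
import math


def bernoulli_kl(p, q, eps=1e-12):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clamped away from 0 and 1."""
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def kl_ucb_index(mean, pulls, t, tol=1e-9):
    """Largest q in [mean, 1] with pulls * KL(mean, q) <= log(t).

    Assumption: the exploration threshold is log(t) with no log-log term.
    """
    if pulls == 0:
        return 1.0  # an arm that was never pulled gets the maximal index
    threshold = math.log(max(t, 1)) / pulls
    lo, hi = mean, 1.0
    # KL(mean, q) is increasing in q on [mean, 1], so bisection applies.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if bernoulli_kl(mean, mid) <= threshold:
            lo = mid
        else:
            hi = mid
    return lo
```

In a single-play bandit, the arm with the largest index is pulled at each round. The index shrinks toward the empirical mean as an arm accumulates pulls.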
Dynamic Learning of Sequential Choice Bandit Problem under Marketing Fatigue
Motivated by the observation that overexposure to unwanted marketing
activities leads to customer dissatisfaction, we consider a setting where a
platform offers a sequence of messages to its users and is penalized when users
abandon the platform due to marketing fatigue. We propose a novel sequential
choice model to capture multiple interactions taking place between the platform
and its user: upon receiving a message, a user chooses one of three
actions: accept the message, skip and receive the next message, or abandon the
platform. Based on user feedback, the platform dynamically learns users'
abandonment distribution and their valuations of messages to determine the
length of the sequence and the order of the messages, while maximizing the
cumulative payoff over a horizon of length T. We refer to this online learning
task as the sequential choice bandit problem. For the offline combinatorial
optimization problem, we show that an efficient polynomial-time algorithm
exists. For the online problem, we propose an algorithm that balances
exploration and exploitation, and characterize its regret bound. Lastly, we
demonstrate how to extend the model with user contexts to incorporate
personalization.
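
To make the accept/skip/abandon dynamics concrete, here is a minimal sketch of the expected payoff of a fixed message sequence under a deliberately simplified model. It is not the model from the abstract above. The assumptions are hypothetical: each message has a known acceptance probability and revenue, the interaction ends when a message is accepted, a user who does not accept abandons with one fixed probability, and abandonment costs a fixed penalty.

```python
def expected_payoff(messages, abandon_prob, abandon_penalty):
    """Expected payoff of showing `messages` in the given order.

    messages: list of (accept_prob, revenue) pairs (hypothetical parameters).
    abandon_prob: chance a user abandons after declining a message (assumed constant).
    abandon_penalty: cost charged to the platform when the user abandons.
    """
    reach = 1.0  # probability that the user is still present to see this message
    total = 0.0
    for accept_prob, revenue in messages:
        # The user accepts and the platform collects the revenue.
        total += reach * accept_prob * revenue
        # The user declines and then abandons, so the platform pays the penalty.
        total -= reach * (1 - accept_prob) * abandon_prob * abandon_penalty
        # Otherwise the user skips and moves on to the next message.
        reach *= (1 - accept_prob) * (1 - abandon_prob)
    return total
```

Even in this toy version, both the order and the length of the sequence change the payoff: swapping two messages changes how likely the user is to reach each one, and appending a message with low acceptance probability can reduce the total because of the added abandonment risk.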