207 research outputs found
Dynamic Ad Allocation: Bandits with Budgets
We consider an application of multi-armed bandits to internet advertising
(specifically, to dynamic ad allocation in the pay-per-click model, with
uncertainty on the click probabilities). We focus on an important practical
issue that advertisers are constrained in how much money they can spend on
their ad campaigns. This issue has not been considered in the prior work on
bandit-based approaches for ad allocation, to the best of our knowledge.
We define a simple, stylized model where an algorithm picks one ad to display
in each round, and each ad has a \emph{budget}: the maximal amount of money
that can be spent on this ad. This model admits a natural variant of UCB1, a
well-known algorithm for multi-armed bandits with stochastic rewards. We derive
strong provable guarantees for this algorithm
Learning Prices for Repeated Auctions with Strategic Buyers
Inspired by real-time ad exchanges for online display advertising, we
consider the problem of inferring a buyer's value distribution for a good when
the buyer is repeatedly interacting with a seller through a posted-price
mechanism. We model the buyer as a strategic agent, whose goal is to maximize
her long-term surplus, and we are interested in mechanisms that maximize the
seller's long-term revenue. We define the natural notion of strategic regret
--- the lost revenue as measured against a truthful (non-strategic) buyer. We
present seller algorithms that are no-(strategic)-regret when the buyer
discounts her future surplus --- i.e. the buyer prefers showing advertisements
to users sooner rather than later. We also give a lower bound on strategic
regret that increases as the buyer's discounting weakens and shows, in
particular, that any seller algorithm will suffer linear strategic regret if
there is no discounting.Comment: Neural Information Processing Systems (NIPS 2013
- …