
    Dynamic Ad Allocation: Bandits with Budgets

    We consider an application of multi-armed bandits to internet advertising (specifically, to dynamic ad allocation in the pay-per-click model, with uncertainty on the click probabilities). We focus on an important practical issue: advertisers are constrained in how much money they can spend on their ad campaigns. To the best of our knowledge, this issue has not been considered in prior work on bandit-based approaches to ad allocation. We define a simple, stylized model where an algorithm picks one ad to display in each round, and each ad has a budget: the maximal amount of money that can be spent on this ad. This model admits a natural variant of UCB1, a well-known algorithm for multi-armed bandits with stochastic rewards. We derive strong provable guarantees for this algorithm.
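The budgeted model in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm or analysis: it is plain UCB1 restricted to ads that can still afford one more click, and it assumes a fixed per-click payment and Bernoulli clicks. The function name `budgeted_ucb1` and all of its parameters are hypothetical.

```python
import math
import random


def budgeted_ucb1(click_probs, budgets, payment=1.0, rounds=1000, seed=0):
    """Illustrative sketch: UCB1 over ads that still have budget left.

    Assumptions (not from the source): each click costs a fixed `payment`,
    clicks are Bernoulli with the given probabilities, and an ad is
    eligible only while it can afford one more click.
    """
    rng = random.Random(seed)
    n = len(click_probs)
    shows = [0] * n    # times each ad was displayed
    clicks = [0] * n   # clicks each ad received
    spent = [0.0] * n  # money charged to each ad so far

    for t in range(1, rounds + 1):
        # Only ads that can pay for one more click are eligible.
        active = [i for i in range(n) if spent[i] + payment <= budgets[i]]
        if not active:
            break  # every budget is exhausted

        untried = [i for i in active if shows[i] == 0]
        if untried:
            arm = untried[0]  # display each eligible ad once first
        else:
            # Standard UCB1 index: empirical mean plus confidence radius.
            arm = max(
                active,
                key=lambda i: clicks[i] / shows[i]
                + math.sqrt(2 * math.log(t) / shows[i]),
            )

        shows[arm] += 1
        if rng.random() < click_probs[arm]:
            clicks[arm] += 1
            spent[arm] += payment  # pay-per-click: charge only on a click

    return shows, clicks, spent


if __name__ == "__main__":
    shows, clicks, spent = budgeted_ucb1([0.9, 0.5], [5.0, 3.0], rounds=500)
    print("shows:", shows, "clicks:", clicks, "spent:", spent)
```

Because an ad is charged only when it is clicked, the amount spent on each ad never exceeds its budget, and the loop stops early once no ad can afford another click.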

    Learning Prices for Repeated Auctions with Strategic Buyers

    Inspired by real-time ad exchanges for online display advertising, we consider the problem of inferring a buyer's value distribution for a good when the buyer is repeatedly interacting with a seller through a posted-price mechanism. We model the buyer as a strategic agent, whose goal is to maximize her long-term surplus, and we are interested in mechanisms that maximize the seller's long-term revenue. We define the natural notion of strategic regret: the lost revenue as measured against a truthful (non-strategic) buyer. We present seller algorithms that are no-(strategic)-regret when the buyer discounts her future surplus, i.e., the buyer prefers showing advertisements to users sooner rather than later. We also give a lower bound on strategic regret that increases as the buyer's discounting weakens and shows, in particular, that any seller algorithm will suffer linear strategic regret if there is no discounting. Comment: Neural Information Processing Systems (NIPS 2013)

    Online learning in repeated auctions

    Motivated by online advertising auctions, we consider repeated Vickrey auctions where goods of unknown value are sold sequentially and bidders only learn (potentially noisy) information about a good's value once it is purchased. We adopt an online learning approach with bandit feedback to model this problem and derive bidding strategies for two models: stochastic and adversarial. In the stochastic model, the observed values of the goods are random variables centered around the true value of the good. In this case, logarithmic regret is achievable when competing against well-behaved adversaries. In the adversarial model, the goods need not be identical and we simply compare our performance against that of the best fixed bid in hindsight. We show that sublinear regret is also achievable in this case and prove matching minimax lower bounds. To our knowledge, this is the first complete set of strategies for bidders participating in auctions of this type.
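The feedback structure described in this abstract can be made concrete with a small sketch of a single round. It shows only the standard second-price (Vickrey) rule plus the bandit-style feedback, in which the value is observed only after a purchase; it is not one of the paper's bidding strategies. The function name `second_price_round`, its arguments, and the tie-breaking rule (ties count as losses) are assumptions.

```python
from typing import NamedTuple, Optional


class RoundOutcome(NamedTuple):
    won: bool                        # did our bidder win the good?
    price: float                     # amount paid (0.0 if the round was lost)
    payoff: float                    # value minus price if won, else 0.0
    observed_value: Optional[float]  # feedback: revealed only on a win


def second_price_round(bid: float, highest_competing_bid: float,
                       value: float) -> RoundOutcome:
    """One round of a second-price auction from a single bidder's view.

    Assumption: ties are broken against our bidder. The winner pays the
    highest competing bid, and the good's value is observed only when the
    good is actually purchased.
    """
    if bid > highest_competing_bid:
        price = highest_competing_bid
        return RoundOutcome(True, price, value - price, value)
    # Losing rounds reveal nothing about the value.
    return RoundOutcome(False, 0.0, 0.0, None)


if __name__ == "__main__":
    print(second_price_round(bid=0.75, highest_competing_bid=0.5, value=1.0))
    print(second_price_round(bid=0.25, highest_competing_bid=0.5, value=1.0))
```

A learning bidder would call such a function once per round and update its value estimate only in rounds where `observed_value` is not `None`, which is what makes the problem one of bandit rather than full-information feedback.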

    Algorithms as Mechanisms: The Price of Anarchy of Relax-and-Round

    Many algorithms that are originally designed without explicitly considering incentive properties are later combined with simple pricing rules and used as mechanisms. The resulting mechanisms are often natural and simple to understand. But how good are these algorithms as mechanisms? Truthful reporting of valuations is typically not a dominant strategy (certainly not with a pay-your-bid, first-price rule, and it is likely not a good strategy even with a critical-value or second-price-style rule). Our goal is to show that a wide class of approximation algorithms yields, in this way, mechanisms with low Price of Anarchy. The seminal result of Lucier and Borodin [SODA 2010] shows that combining a greedy algorithm that is an α-approximation algorithm with a pay-your-bid payment rule yields a mechanism whose Price of Anarchy is O(α). In this paper we significantly extend the class of algorithms for which such a result is available by showing that this close connection between approximation ratio on the one hand and Price of Anarchy on the other also holds for the design principle of relaxation and rounding, provided that the relaxation is smooth and the rounding is oblivious. We demonstrate the far-reaching consequences of our result by showing its implications for sparse packing integer programs, such as multi-unit auctions and generalized matching, for the maximum traveling salesman problem, for combinatorial auctions, and for single source unsplittable flow problems. In all these problems our approach leads to novel, simple, near-optimal mechanisms whose Price of Anarchy either matches or beats the performance guarantees of known mechanisms. Comment: Extended abstract appeared in Proc. of 16th ACM Conference on Economics and Computation (EC'15)