A Game-theoretic Machine Learning Approach for Revenue Maximization in Sponsored Search
Sponsored search is an important monetization channel for search engines, in
which an auction mechanism is used to select the ads shown to users and
determine the prices charged to advertisers. There have been several pieces
of work in the literature that investigate how to design an auction mechanism
in order to optimize the revenue of the search engine. However, because of some
unrealistic assumptions, the practical value of these studies is not
very clear. In this paper, we propose a novel \emph{game-theoretic machine
learning} approach, which naturally combines machine learning and game theory,
and learns the auction mechanism using a bilevel optimization framework. In
particular, we first learn a Markov model from historical data to describe how
advertisers change their bids in response to an auction mechanism, and then for
any given auction mechanism, we use the learnt model to predict its
corresponding future bid sequences. Next we learn the auction mechanism through
empirical revenue maximization on the predicted bid sequences. We show that the
empirical revenue will converge when the prediction period approaches infinity,
and a Genetic Programming algorithm can effectively optimize this empirical
revenue. Our experiments indicate that the proposed approach is able to produce
a much more effective auction mechanism than several baselines.
Comment: Twenty-third International Conference on Artificial Intelligence
(IJCAI 2013)
Real-Time Bidding by Reinforcement Learning in Display Advertising
The majority of online display ads are served through real-time bidding (RTB)
--- each ad display impression is auctioned off in real-time when it is just
being generated from a user visit. To place an ad automatically and optimally,
it is critical for advertisers to devise a learning algorithm to cleverly bid on
an ad impression in real-time. Most previous works consider the bid decision as
a static optimization problem of either treating the value of each impression
independently or setting a bid price to each segment of ad volume. However, the
bidding for a given ad campaign would repeatedly happen during its life span
before the budget runs out. As such, each bid is strategically correlated by
the constrained budget and the overall effectiveness of the campaign (e.g., the
rewards from generated clicks), which is only observed after the campaign has
completed. Thus, it is of great interest to devise an optimal bidding strategy
sequentially so that the campaign budget can be dynamically allocated across
all the available impressions on the basis of both the immediate and future
rewards. In this paper, we formulate the bid decision process as a
reinforcement learning problem, where the state space is represented by the
auction information and the campaign's real-time parameters, while an action is
the bid price to set. By modeling the state transition via auction competition,
we build a Markov Decision Process framework for learning the optimal bidding
policy to optimize the advertising performance in the dynamic real-time bidding
environment. Furthermore, the scalability problem from the large real-world
auction volume and campaign budget is well handled by state value approximation
using neural networks.
Comment: WSDM 201
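The Markov Decision Process formulation described in this abstract can be illustrated with a small tabular sketch. It is not the authors' implementation: the market price distribution `MARKET_PRICE_PMF`, the reward `CLICK_VALUE`, the second-price payment rule, and the function names are all assumptions made for illustration. The state is (auctions left, remaining budget), the action is the bid, and exact dynamic programming stands in for the neural-network value approximation that the abstract mentions for large-scale settings.

```python
from functools import lru_cache

# Hypothetical toy market: probability that the highest competing bid equals each price.
MARKET_PRICE_PMF = {1: 0.5, 2: 0.3, 3: 0.2}
# Assumed expected reward (e.g. click probability) for every impression won.
CLICK_VALUE = 1.0


def q_value(auctions_left: int, budget: int, bid: int) -> float:
    """Expected total reward of placing `bid` now and acting optimally afterwards."""
    total = 0.0
    for price, prob in MARKET_PRICE_PMF.items():
        if price <= bid:
            # Win the impression, pay the market price (second-price assumption).
            total += prob * (CLICK_VALUE + value(auctions_left - 1, budget - price))
        else:
            # Lose the impression; the budget is unchanged.
            total += prob * value(auctions_left - 1, budget)
    return total


@lru_cache(maxsize=None)
def value(auctions_left: int, budget: int) -> float:
    """Optimal expected reward from the state (auctions_left, budget)."""
    if auctions_left == 0:
        return 0.0
    # Bids are capped at the remaining budget, so the budget never goes negative.
    return max(q_value(auctions_left, budget, bid) for bid in range(budget + 1))


def best_bid(auctions_left: int, budget: int) -> int:
    """Bid that maximizes the expected reward in the given state."""
    return max(range(budget + 1), key=lambda bid: q_value(auctions_left, budget, bid))


if __name__ == "__main__":
    for budget in range(4):
        print(budget, best_bid(2, budget), round(value(2, budget), 3))
```

The table grows with the number of auctions times the budget, which is the scalability problem the abstract refers to; a function approximator would replace `value` in that regime.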
Born to trade: a genetically evolved keyword bidder for sponsored search
In sponsored search auctions, advertisers choose a set of keywords based on products they wish to market. They bid for advertising slots that will be displayed on the search results page when a user submits a query containing the keywords that the advertiser selected. Deciding how much to bid is a real challenge: if the bid is too low with respect to the bids of other advertisers, the ad might not get displayed in a favorable position; a bid that is too high on the other hand might not be profitable either, since the attracted number of conversions might not be enough to compensate for the high cost per click.
In this paper we propose a genetically evolved keyword bidding strategy that decides how much to bid for each query based on historical data such as the position obtained on the previous day. Although our approach does not encode any particular expert knowledge on keyword auctions, it did remarkably well in the Trading Agent Competition at IJCAI 2009.
Econometrics for Learning Agents
The main goal of this paper is to develop a theory of inference of player
valuations from observed data in the generalized second price auction without
relying on the Nash equilibrium assumption. Existing work in Economics on
inferring agent values from data relies on the assumption that all participant
strategies are best responses to the observed play of other players, i.e., they
constitute a Nash equilibrium. In this paper, we show how to perform inference
relying on a weaker assumption instead: assuming that players are using some
form of no-regret learning. Learning outcomes have emerged in recent years as an
attractive alternative to Nash equilibrium in analyzing game outcomes, modeling
players who haven't reached a stable equilibrium, but rather use algorithmic
learning, aiming to learn the best way to play from previous observations. In
this paper we show how to infer values of players who use algorithmic learning
strategies. Such inference is an important first step before we move to testing
any learning theoretic behavioral model on auction data. We apply our
techniques to a dataset from Microsoft's sponsored search ad auction system.
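The idea of inferring values under a no-regret assumption can be sketched as follows. This is a simplified illustration rather than the paper's procedure: it uses a single-item second-price auction instead of the generalized second price auction, and the names `utility`, `average_regret`, and `rationalizable_values` are hypothetical. For each candidate value, it computes how much better the best fixed bid in hindsight would have done than the observed bids, and keeps the values whose average regret is at most a tolerance `epsilon`.

```python
def utility(value, bid, competing_bid):
    """Second-price utility: the bidder wins only if the bid strictly exceeds the competitor's."""
    return value - competing_bid if bid > competing_bid else 0.0


def average_regret(value, observed_bids, competing_bids, candidate_bids):
    """Average gain of the best fixed bid in hindsight over the bids actually placed."""
    rounds = len(observed_bids)
    realized = sum(
        utility(value, bid, rival) for bid, rival in zip(observed_bids, competing_bids)
    ) / rounds
    best_fixed = max(
        sum(utility(value, fixed, rival) for rival in competing_bids) / rounds
        for fixed in candidate_bids
    )
    return best_fixed - realized


def rationalizable_values(candidate_values, epsilon, observed_bids, competing_bids, candidate_bids):
    """Values under which the observed bids have average regret of at most epsilon."""
    return [
        v
        for v in candidate_values
        if average_regret(v, observed_bids, competing_bids, candidate_bids) <= epsilon
    ]


if __name__ == "__main__":
    observed = [2, 2, 2]   # bids placed by the player (toy data)
    rivals = [1, 3, 1]     # highest competing bid in each round (toy data)
    print(rationalizable_values(range(6), 0.1, observed, rivals, range(5)))
```

The result is a set of values rather than a point estimate, since several values can be consistent with the same small regret.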
Optimizing Your Online-Advertisement Asynchronously
We consider the problem of designing optimal online-ad investment strategies
for a single advertiser, who invests at multiple sponsored search sites
simultaneously, with the objective of maximizing his average revenue subject to
the advertising budget constraint. A greedy online investment scheme is
developed to achieve an average revenue that can be pushed to within
$O(\epsilon)$ of the optimal, for any $\epsilon > 0$, with a tradeoff that the
temporal budget violation is $O(1/\epsilon)$. Unlike many existing
algorithms, our scheme allows the advertiser to \emph{asynchronously} update
his investments on each search engine site, hence applies to systems where the
timescales of action update intervals are heterogeneous for different sites. We
also quantify the impact of inaccurate estimation of the system dynamics and
show that the algorithm is robust against imperfect system knowledge.
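The abstract does not spell out the greedy scheme. One standard way to trade average revenue against temporary budget violation is a drift-plus-penalty rule with a virtual queue, and the sketch below uses that technique as an assumption, not as the paper's algorithm. It covers a single site with a fixed menu of (cost, revenue) options per slot and does not model asynchronous updates; `run_greedy_investment` and `tradeoff_v` are hypothetical names.

```python
def run_greedy_investment(options_per_slot, budget_per_slot, tradeoff_v):
    """Greedy per-slot choice with a virtual queue that tracks accumulated overspend.

    options_per_slot: one list of (cost, revenue) pairs for every time slot.
    Returns (average revenue, average cost, final virtual queue length).
    """
    queue = 0.0
    total_cost = 0.0
    total_revenue = 0.0
    for options in options_per_slot:
        # Weigh revenue by tradeoff_v and penalize cost by the current overspend.
        cost, revenue = max(options, key=lambda o: tradeoff_v * o[1] - queue * o[0])
        # The queue grows when a slot overspends and drains when it underspends.
        queue = max(queue + cost - budget_per_slot, 0.0)
        total_cost += cost
        total_revenue += revenue
    slots = len(options_per_slot)
    return total_revenue / slots, total_cost / slots, queue


if __name__ == "__main__":
    menu = [(0, 0), (1, 1), (3, 2)]  # toy (cost, revenue) options, identical in every slot
    for v in (1, 10):
        print(v, run_greedy_investment([menu] * 999, budget_per_slot=1, tradeoff_v=v))
```

In this toy setting a larger `tradeoff_v` brings the average revenue closer to the best budget-feasible value while allowing a larger temporary overspend, which mirrors the kind of tradeoff the abstract describes.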