Real-Time Bidding by Reinforcement Learning in Display Advertising
The majority of online display ads are served through real-time bidding (RTB):
each ad display impression is auctioned off in real time as it is
generated by a user visit. To place an ad automatically and optimally,
it is critical for advertisers to devise a learning algorithm that bids
on each ad impression in real time. Most previous works consider the bid decision as
a static optimization problem, either treating the value of each impression
independently or setting a bid price for each segment of ad volume. However,
bidding for a given ad campaign happens repeatedly over its life span
until the budget runs out. As such, the bids are strategically correlated through
the constrained budget and the overall effectiveness of the campaign (e.g., the
rewards from generated clicks), which is only observed after the campaign has
completed. Thus, it is of great interest to devise an optimal bidding strategy
sequentially so that the campaign budget can be dynamically allocated across
all the available impressions on the basis of both the immediate and future
rewards. In this paper, we formulate the bid decision process as a
reinforcement learning problem, where the state space is represented by the
auction information and the campaign's real-time parameters, while an action is
the bid price to set. By modeling the state transition via auction competition,
we build a Markov Decision Process framework for learning the optimal bidding
policy to optimize the advertising performance in the dynamic real-time bidding
environment. Furthermore, the scalability problem posed by large real-world
auction volumes and campaign budgets is handled by approximating the state value
with neural networks.
Comment: WSDM 201
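The budget-constrained bidding problem described in this abstract can be illustrated with a small tabular sketch. It is not the paper's implementation: it assumes a second-price auction, integer prices, a known market-price distribution (`market_price_pmf`) and a constant click probability (`ctr`), all of which are hypothetical simplifications. The state is reduced to the pair (auctions left, budget left), and the action is the bid price. The abstract's neural-network value approximation, which is what makes the approach scale, is left out.

```python
def optimal_bid_table(market_price_pmf, ctr, n_auctions, budget):
    """Toy dynamic program over (auctions left, budget left).

    market_price_pmf[d] is the assumed probability that the highest
    competing bid equals d (integer price units). ctr is an assumed
    constant click probability for every won impression.
    Returns (value, policy), both indexed as [auctions_left][budget_left].
    """
    max_price = len(market_price_pmf) - 1
    # value[t][b]: expected clicks obtainable with t auctions and budget b left
    value = [[0.0] * (budget + 1) for _ in range(n_auctions + 1)]
    policy = [[0] * (budget + 1) for _ in range(n_auctions + 1)]

    for t in range(1, n_auctions + 1):
        for b in range(budget + 1):
            best_value, best_bid = -1.0, 0
            win_part = 0.0   # accumulated value of outcomes where we win
            lose_prob = 1.0  # probability that the market price exceeds the bid
            for bid in range(min(b, max_price) + 1):
                # Winning at market price `bid`: collect a click with
                # probability ctr, pay the market price, move to t - 1.
                win_part += market_price_pmf[bid] * (ctr + value[t - 1][b - bid])
                lose_prob -= market_price_pmf[bid]
                # Losing: budget is untouched, one auction is used up.
                q = win_part + lose_prob * value[t - 1][b]
                if q > best_value + 1e-12:
                    best_value, best_bid = q, bid
            value[t][b] = best_value
            policy[t][b] = best_bid
    return value, policy


# Market price is 1 or 2 with equal probability; two auctions, budget of 2.
value, policy = optimal_bid_table([0.0, 0.5, 0.5], ctr=0.1, n_auctions=2, budget=2)
print(value[2][2], policy[1][2])
```

In this toy setting the bidder with one auction left and a budget of 2 bids the full budget, while with two auctions left the value reflects the option of saving budget for the later auction. The table grows with the product of auction volume and budget, which is the scalability issue the abstract mentions.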
Statistical Arbitrage Mining for Display Advertising
We study and formulate arbitrage in display advertising. Real-Time Bidding
(RTB) mimics stock spot exchanges and utilises computers to algorithmically buy
display ads per impression via a real-time auction. Despite the new automation,
the ad markets are still informationally inefficient due to the heavily
fragmented marketplaces. Two display impressions with similar or identical
effectiveness (e.g., measured by conversion or click-through rates for a
targeted audience) may sell for quite different prices at different market
segments or pricing schemes. In this paper, we propose a novel data mining
paradigm called Statistical Arbitrage Mining (SAM) focusing on mining and
exploiting price discrepancies between two pricing schemes. In essence, our
SAMer is a meta-bidder that hedges advertisers' risk between CPA (cost per
action)-based campaigns and CPM (cost per mille impressions)-based ad
inventories; it statistically assesses the potential profit and cost for an
incoming CPM bid request against a portfolio of CPA campaigns based on the
estimated conversion rate, bid landscape and other statistics learned from
historical data. In SAM, (i) functional optimisation is utilised to seek the
optimal bidding strategy that maximises the expected arbitrage net profit, and (ii) a
portfolio-based risk management solution is leveraged to reallocate bid volume
and budget across the set of campaigns to trade off risk against return. We
propose to jointly optimise both components in an EM fashion with high
efficiency to help the meta-bidder successfully catch the transient statistical
arbitrage opportunities in RTB. Both the offline experiments on a real-world
large-scale dataset and online A/B tests on a commercial platform demonstrate
the effectiveness of our proposed solution in exploiting arbitrage in various
model settings and market environments.
Comment: In the proceedings of the 21st ACM SIGKDD International Conference on
Knowledge Discovery and Data Mining (KDD 2015)
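The profit assessment of a single CPM bid request against a CPA campaign, as described in this abstract, can be sketched as follows. This is a simplified illustration, not the SAM algorithm: it assumes a second-price auction with integer prices, a known bid landscape (`market_price_pmf`), and one campaign with a fixed payoff per action. It ignores budget limits, risk, and the campaign portfolio. All function and parameter names are hypothetical.

```python
def expected_arbitrage_profit(bid, payoff_per_action, conversion_rate,
                              market_price_pmf):
    """Expected net profit of one bid under a second-price auction.

    market_price_pmf[z] is the assumed probability that the winning
    price equals z. If the bid wins, the meta-bidder pays z and earns
    payoff_per_action * conversion_rate in expectation.
    """
    expected_revenue = payoff_per_action * conversion_rate
    profit = 0.0
    for price, prob in enumerate(market_price_pmf):
        if price <= bid:  # the bid wins whenever it covers the market price
            profit += prob * (expected_revenue - price)
    return profit


def best_bid(payoff_per_action, conversion_rate, market_price_pmf):
    """Exhaustively pick the integer bid with the highest expected profit."""
    candidates = range(len(market_price_pmf))
    return max(
        candidates,
        key=lambda b: expected_arbitrage_profit(
            b, payoff_per_action, conversion_rate, market_price_pmf
        ),
    )


# Market price uniform on 1..5; expected revenue per impression is 3.5.
pmf = [0.0, 0.2, 0.2, 0.2, 0.2, 0.2]
bid = best_bid(payoff_per_action=100.0, conversion_rate=0.035,
               market_price_pmf=pmf)
print(bid, expected_arbitrage_profit(bid, 100.0, 0.035, pmf))
```

Under these assumptions the best bid is the highest price that still lies below the expected revenue per impression, since every additional price level won above that point contributes negative expected profit.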