Optimal Real-Time Bidding for Display Advertising
Real-Time Bidding (RTB) is revolutionising display advertising by facilitating a real-time auction for each ad impression. As they are able to use impression-level data, such as user cookies and context information, advertisers can adaptively bid for each ad impression. Therefore, it is important that an advertiser designs an effective bidding strategy, which can be abstracted as a function mapping the information of a specific ad impression to the bid price. Exactly how this bidding function should be designed is a non-trivial problem. It involves multiple factors, such as the campaign-specific key performance indicator (KPI), the campaign lifetime auction volume and the budget. This thesis focuses on the design of automatic solutions to this problem of creating optimised bidding strategies for RTB auctions: that is, strategies which are optimal from the perspective of an advertiser agent, maximising the campaign's KPI subject to the constraints of the auction volume and the budget. The problem is mathematically formulated as a functional optimisation framework where the optimal bidding function can be derived without any functional form restriction. Beyond single-campaign bid optimisation, the proposed framework can be extended to multi-campaign cases, where a portfolio-optimisation solution of auction volume reallocation is performed to maximise the overall profit with a controlled risk. On the model learning side, an unbiased learning scheme is proposed to address the data bias problem resulting from the ad auction selection, where we derive a "bid-aware" gradient descent algorithm to train unbiased models. Moreover, the robustness of achieving the expected KPIs in a dynamic RTB market is addressed with a feedback control mechanism for bid adjustment. To support the theoretical derivations, extensive experiments are carried out based on large-scale real-world data.
The proposed solutions have been deployed in three commercial RTB systems in China and the United States. The online A/B tests have demonstrated substantial improvement of the proposed solutions over strong baselines.
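As an illustrative sketch only, a budget-constrained bidding problem of this kind can be written as a functional optimisation over the bidding function. The notation is an assumption and is not taken from the abstract: b(θ) is the bid for an impression with predicted KPI value θ (for example, click-through rate), p_θ(θ) is the density of θ, w(b) is the probability of winning with bid b, N is the auction volume and B is the budget. The cost of a won impression is bounded above by the bid itself.

```latex
\begin{aligned}
b^{*}(\cdot) = \arg\max_{b(\cdot)} \; & N \int_{\theta} \theta \, w\bigl(b(\theta)\bigr) \, p_{\theta}(\theta) \, d\theta \\
\text{subject to} \; & N \int_{\theta} b(\theta) \, w\bigl(b(\theta)\bigr) \, p_{\theta}(\theta) \, d\theta \le B .
\end{aligned}
% Pointwise stationarity of the Lagrangian integrand with multiplier \lambda:
% \theta \, w'(b) - \lambda \bigl[ w(b) + b \, w'(b) \bigr] = 0
% \quad\Longleftrightarrow\quad
% \lambda \, w\bigl(b(\theta)\bigr) = \bigl[\theta - \lambda \, b(\theta)\bigr] \, w'\bigl(b(\theta)\bigr)
```

Under these assumptions, the stationarity condition ties the bid to θ through the winning function w alone, which is why no functional form needs to be imposed on b in advance.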
Real-Time Bidding by Reinforcement Learning in Display Advertising
The majority of online display ads are served through real-time bidding (RTB):
each ad display impression is auctioned off in real time when it is just
being generated from a user visit. To place an ad automatically and optimally,
it is critical for advertisers to devise a learning algorithm to cleverly bid
on an ad impression in real time. Most previous works consider the bid decision as
a static optimization problem of either treating the value of each impression
independently or setting a bid price to each segment of ad volume. However, the
bidding for a given ad campaign would repeatedly happen during its life span
before the budget runs out. As such, each bid is strategically correlated by
the constrained budget and the overall effectiveness of the campaign (e.g., the
rewards from generated clicks), which is only observed after the campaign has
completed. Thus, it is of great interest to devise an optimal bidding strategy
sequentially so that the campaign budget can be dynamically allocated across
all the available impressions on the basis of both the immediate and future
rewards. In this paper, we formulate the bid decision process as a
reinforcement learning problem, where the state space is represented by the
auction information and the campaign's real-time parameters, while an action is
the bid price to set. By modeling the state transition via auction competition,
we build a Markov Decision Process framework for learning the optimal bidding
policy to optimize the advertising performance in the dynamic real-time bidding
environment. Furthermore, the scalability problem from the large real-world
auction volume and campaign budget is well handled by state value approximation
using neural networks.
Comment: WSDM 201
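The Markov Decision Process view described above can be sketched with a small dynamic program. This is a toy illustration under stated assumptions, not the paper's implementation: the state is (auctions left, remaining budget), the action is an integer bid, the auction is second-price, and the market price distribution `MARKET_PRICE_PMF`, the constant click probability `CTR` and the cap `MAX_BID` are hypothetical values chosen for the example. The paper replaces exact tabulation with neural-network value approximation for realistic scales.

```python
from functools import lru_cache

# Hypothetical toy inputs (assumptions for illustration only).
MARKET_PRICE_PMF = {1: 0.2, 2: 0.3, 3: 0.3, 4: 0.2}  # P(highest competing bid == price)
CTR = 0.1      # assumed constant click probability of every impression
MAX_BID = 4    # largest bid the agent may place


def q_value(auctions_left: int, budget: int, bid: int) -> float:
    """Expected total clicks if we place `bid` now and act optimally afterwards."""
    total = 0.0
    for price, prob in MARKET_PRICE_PMF.items():
        if price <= bid:
            # We win, pay the market price (second-price rule) and may get a click.
            total += prob * (CTR + value(auctions_left - 1, budget - price))
        else:
            # We lose: the budget is unchanged and one fewer auction remains.
            total += prob * value(auctions_left - 1, budget)
    return total


@lru_cache(maxsize=None)
def value(auctions_left: int, budget: int) -> float:
    """Optimal expected clicks from state (auctions_left, budget)."""
    if auctions_left == 0:
        return 0.0
    bids = range(0, min(MAX_BID, budget) + 1)  # never bid more than the budget
    return max(q_value(auctions_left, budget, bid) for bid in bids)


def best_bid(auctions_left: int, budget: int) -> int:
    """Greedy policy with respect to the tabulated value function."""
    bids = range(0, min(MAX_BID, budget) + 1)
    return max(bids, key=lambda bid: q_value(auctions_left, budget, bid))
```

Because the bid never exceeds the remaining budget and the payment never exceeds the bid, the budget stays non-negative in every reachable state.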
Bid Optimization by Multivariable Control in Display Advertising
Real-Time Bidding (RTB) is an important paradigm in display advertising,
where advertisers utilize extended information and algorithms served by Demand
Side Platforms (DSPs) to improve advertising performance. A common problem for
DSPs is to help advertisers gain as much value as possible with budget
constraints. However, advertisers would routinely add certain key performance
indicator (KPI) constraints that the advertising campaign must meet due to
practical reasons. In this paper, we study the common case where advertisers
aim to maximize the quantity of conversions, and set cost-per-click (CPC) as a
KPI constraint. We convert such a problem into a linear programming problem and
leverage the primal-dual method to derive the optimal bidding strategy. To
address the applicability issue, we propose a feedback control-based solution
and devise the multivariable control system. The empirical study based on
real-world data from Taobao.com verifies the effectiveness and superiority of
our approach compared with the state of the art in industry practice.
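To make the feedback-control idea concrete, here is a minimal single-variable sketch. It is not the paper's multivariable controller: it is a plain proportional-integral (PI) loop that scales a base bid so that a measured cost-per-click drifts toward a target. The class name `PIController`, the gains `kp` and `ki`, the clamp of plus or minus 2 and the exponential actuator are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class PIController:
    """Proportional-integral controller on the relative KPI error."""
    kp: float
    ki: float
    integral: float = 0.0

    def update(self, target: float, measured: float) -> float:
        # Positive error means we are under the target and may bid higher.
        error = (target - measured) / target
        self.integral += error
        return self.kp * error + self.ki * self.integral


def adjust_bid(base_bid: float, control_signal: float) -> float:
    """Scale the base bid by exp(signal), with the signal clamped to [-2, 2]."""
    clamped = max(-2.0, min(2.0, control_signal))
    return base_bid * math.exp(clamped)


# Usage: the measured CPC (1.5) exceeds the target (1.0), so the bid is lowered.
controller = PIController(kp=0.5, ki=0.1)
signal = controller.update(target=1.0, measured=1.5)
new_bid = adjust_bid(base_bid=2.0, control_signal=signal)
```

The exponential actuator keeps the adjusted bid positive for any signal, and the clamp limits how far a single noisy measurement can move it.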
A dynamic pricing model for unifying programmatic guarantee and real-time bidding in display advertising
There are two major ways of selling impressions in display advertising. They
are either sold in spot through auction mechanisms or in advance via guaranteed
contracts. The former has achieved a significant automation via real-time
bidding (RTB); however, the latter is still mainly done over the counter
through direct sales. This paper proposes a mathematical model that allocates
and prices the future impressions between real-time auctions and guaranteed
contracts. Under conventional economic assumptions, our model shows that the
two ways can be seamlessly combined programmatically and the publisher's revenue
can be maximized via price discrimination and optimal allocation. We assume that
advertisers are risk-averse, and that they would be willing to purchase guaranteed
impressions if the total costs are less than their private values. We also
consider that an advertiser's purchase behavior can be affected by both the
guaranteed price and the time interval between the purchase time and the
impression delivery date. Our solution suggests an optimal percentage of future
impressions to sell in advance and provides an explicit formula to calculate at
what prices to sell. We find that the optimal guaranteed prices are dynamic and
are non-decreasing over time. We evaluate our method with RTB datasets and find
that the model adopts different strategies in allocation and pricing according
to the level of competition. From the experiments we find that, in a less
competitive market, lower prices of the guaranteed contracts will encourage the
purchase in advance and the revenue gain is mainly contributed by the increased
competition in future RTB. In a highly competitive market, advertisers are more
willing to purchase the guaranteed contracts and thus higher prices are
expected. The revenue gain is largely contributed by the guaranteed selling.
Comment: Chen, Bowei and Yuan, Shuai and Wang, Jun (2014) A dynamic pricing
model for unifying programmatic guarantee and real-time bidding in display
advertising. In: The Eighth International Workshop on Data Mining for Online
Advertising, 24 - 27 August 2014, New York City
Real-time Bidding for Online Advertising: Measurement and Analysis
Real-time bidding (RTB), also known as programmatic buying, has recently become the
fastest growing area in online advertising. Instead of bulk buying and
inventory-centric buying, RTB mimics stock exchanges and utilises computer
algorithms to automatically buy and sell ads in real time. It uses per-impression
context and targets the ads to specific people based on data about
them, and hence dramatically increases the effectiveness of display
advertising. In this paper, we provide an empirical analysis and measurement of
a production ad exchange. Using the data sampled from both demand and supply
side, we aim to provide first-hand insights into the emerging new impression
selling infrastructure and its bidding behaviours, and help identify
research and design issues in such systems. From our study, we observed that
periodic patterns occur in various statistics including impressions, clicks,
bids, and conversion rates (both post-view and post-click), which suggest
time-dependent models would be appropriate for capturing the repeated patterns
in RTB. We also found that, despite the claimed second-price auction, first-price
payments in fact account for 55.4% of total cost due to the
arrangement of the soft floor price. As such, we argue that the setting of soft
floor price in the current RTB systems puts advertisers in a less favourable
position. Furthermore, our analysis of the conversion rates shows that the
current bidding strategy is far from optimal, indicating a significant need
for optimisation algorithms incorporating factors such as the temporal
behaviours, the frequency and recency of the ad displays, which have not been
well considered in the past.
Comment: Accepted by ADKDD '13 workshop
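The effect of a soft floor on payments can be illustrated with a small pricing function. The rule below is one commonly described soft-floor arrangement and is an assumption here, since the abstract does not spell out the exchange's exact rule: if the highest bid falls below the soft floor, the winner pays its own bid (first price); otherwise the winner pays the larger of the runner-up bid and the soft floor (second price). The function name and the `hard_floor` parameter are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def clearing_price(
    bids: Sequence[float], soft_floor: float, hard_floor: float = 0.0
) -> Optional[Tuple[float, str]]:
    """Return (price paid by the winner, pricing rule used), or None if unsold."""
    # Bids below the hard floor are rejected outright.
    eligible = sorted((b for b in bids if b >= hard_floor), reverse=True)
    if not eligible:
        return None
    top = eligible[0]
    if top < soft_floor:
        # Below the soft floor the auction degrades to first-price.
        return top, "first"
    # At or above the soft floor: second-price, but never below the soft floor.
    runner_up = eligible[1] if len(eligible) > 1 else hard_floor
    return max(runner_up, soft_floor), "second"
```

Under this rule every winning bid below the soft floor is charged in full, which is how a nominally second-price exchange can end up collecting a large share of its revenue at first price.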
Statistical Arbitrage Mining for Display Advertising
We study and formulate arbitrage in display advertising. Real-Time Bidding
(RTB) mimics stock spot exchanges and utilises computers to algorithmically buy
display ads per impression via a real-time auction. Despite the new automation,
the ad markets are still informationally inefficient due to the heavily
fragmented marketplaces. Two display impressions with similar or identical
effectiveness (e.g., measured by conversion or click-through rates for a
targeted audience) may sell for quite different prices at different market
segments or pricing schemes. In this paper, we propose a novel data mining
paradigm called Statistical Arbitrage Mining (SAM) focusing on mining and
exploiting price discrepancies between two pricing schemes. In essence, our
SAMer is a meta-bidder that hedges advertisers' risk between CPA (cost per
action)-based campaigns and CPM (cost per mille impressions)-based ad
inventories; it statistically assesses the potential profit and cost for an
incoming CPM bid request against a portfolio of CPA campaigns based on the
estimated conversion rate, bid landscape and other statistics learned from
historical data. In SAM, (i) functional optimisation is utilised to seek the
optimal bidding to maximise the expected arbitrage net profit, and (ii) a
portfolio-based risk management solution is leveraged to reallocate bid volume
and budget across the set of campaigns to make a risk and return trade-off. We
propose to jointly optimise both components in an EM fashion with high
efficiency to help the meta-bidder successfully catch the transient statistical
arbitrage opportunities in RTB. Both the offline experiments on a real-world
large-scale dataset and online A/B tests on a commercial platform demonstrate
the effectiveness of our proposed solution in exploiting arbitrage in various
model settings and market environments.
Comment: In the proceedings of the 21st ACM SIGKDD international conference on
Knowledge discovery and data mining (KDD 2015)
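The per-request profit assessment can be sketched as follows. This is a simplified illustration, not the SAM algorithm: it ignores budgets and portfolio risk, assumes a second-price auction with a discrete market price distribution, and expresses the CPA payoff and the impression price in the same per-impression currency units. All names and numbers are hypothetical.

```python
from typing import Dict, Iterable


def expected_profit(
    bid: float,
    conversion_rate: float,
    cpa_payoff: float,
    market_price_pmf: Dict[float, float],
) -> float:
    """Expected net arbitrage profit of one bid under a second-price auction."""
    # Expected revenue from the CPA campaign if the impression is won.
    expected_value = conversion_rate * cpa_payoff
    # We win whenever the market price is at most our bid, and pay that price.
    return sum(
        prob * (expected_value - price)
        for price, prob in market_price_pmf.items()
        if price <= bid
    )


def most_profitable_bid(
    candidates: Iterable[float],
    conversion_rate: float,
    cpa_payoff: float,
    market_price_pmf: Dict[float, float],
) -> float:
    """Pick the candidate bid with the highest expected net profit."""
    return max(
        candidates,
        key=lambda b: expected_profit(b, conversion_rate, cpa_payoff, market_price_pmf),
    )


# Usage with toy numbers: expected value per impression is 0.005 * 500 = 2.5.
pmf = {1.0: 0.25, 2.0: 0.25, 3.0: 0.25, 4.0: 0.25}
chosen = most_profitable_bid([1.0, 2.0, 3.0, 4.0], 0.005, 500.0, pmf)
```

In this unconstrained toy setting the best bid is the highest price that is still below the expected value of the impression; the budget and risk terms described in the abstract are what move the optimum away from that point.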
Budget Constrained Bidding by Model-free Reinforcement Learning in Display Advertising
Real-time bidding (RTB) is an important mechanism in online display
advertising, where a proper bid for each page view plays an essential role for
good marketing results. Budget constrained bidding is a typical scenario in RTB
where the advertisers hope to maximize the total value of the winning
impressions under a pre-set budget constraint. However, the optimal bidding
strategy is hard to derive due to the complexity and volatility of the
auction environment. To address these challenges, in this paper, we formulate
budget constrained bidding as a Markov Decision Process and propose a
model-free reinforcement learning framework to resolve the optimization
problem. Our analysis shows that the immediate reward from the environment is
misleading under a critical resource constraint. Therefore, we propose a
reward function design methodology for the reinforcement learning problems with
constraints. Based on the new reward design, we employ a deep neural network to
learn the appropriate reward so that the optimal policy can be learned
effectively. Different from the prior model-based work, which suffers from the
scalability problem, our framework is easy to deploy in large-scale
industrial applications. The experimental evaluations demonstrate the
effectiveness of our framework on large-scale real datasets.
Comment: In The 27th ACM International Conference on Information and Knowledge
Management (CIKM 18), October 22-26, 2018, Torino, Italy. ACM, New York, NY,
USA, 9 pages