Learning Adaptive Display Exposure for Real-Time Advertising
In E-commerce advertising, where product recommendations and product ads are
presented to users simultaneously, the traditional setting is to display ads at
fixed positions. However, under such a setting, the advertising system loses
the flexibility to control the number and positions of ads, resulting in
sub-optimal platform revenue and user experience. Consequently, major
e-commerce platforms (e.g., Taobao.com) have begun to consider more flexible
ways to display ads. In this paper, we investigate the problem of advertising
with adaptive exposure: can we dynamically determine the number and positions
of ads for each user visit under certain business constraints so that the
platform revenue can be increased? More specifically, we consider two types of
constraints: a request-level constraint that ensures user experience for each
user visit, and a platform-level constraint that controls the overall platform
monetization rate. We model this problem as a Constrained Markov Decision Process with
per-state constraint (psCMDP) and propose a constrained two-level reinforcement
learning approach to decompose the original problem into two relatively
independent sub-problems. To accelerate policy learning, we also devise a
constrained hindsight experience replay mechanism. Experimental evaluations on
industry-scale real-world datasets demonstrate that our approach obtains higher
revenue under the constraints and that the constrained hindsight experience
replay mechanism is effective.
Comment: accepted by CIKM201
Real-Time Bidding with Multi-Agent Reinforcement Learning in Display Advertising
Real-time advertising allows advertisers to bid for each impression for a
visiting user. To optimize specific goals such as maximizing revenue and return
on investment (ROI) led by ad placements, advertisers not only need to estimate
the relevance between the ads and user's interests, but most importantly
require a strategic response with respect to other advertisers bidding in the
market. In this paper, we formulate bidding optimization with multi-agent
reinforcement learning. To deal with a large number of advertisers, we propose
a clustering method and assign each cluster a strategic bidding agent. A
practical Distributed Coordinated Multi-Agent Bidding (DCMAB) method has been
proposed and implemented to balance the tradeoff between competition and
cooperation among advertisers. The empirical study on our industry-scale
real-world data demonstrates the effectiveness of our methods. Our results show
that cluster-based bidding largely outperforms single-agent and bandit
approaches, and that coordinated bidding achieves better overall objectives than
purely self-interested bidding agents.
Real-Time Bidding by Reinforcement Learning in Display Advertising
The majority of online display ads are served through real-time bidding (RTB)
--- each ad display impression is auctioned off in real-time when it is just
being generated from a user visit. To place an ad automatically and optimally,
it is critical for advertisers to devise a learning algorithm that bids
intelligently on each ad impression in real time. Most previous work treats the
bid decision as a static optimization problem, either valuing each impression
independently or setting a bid price for each segment of ad volume. However,
bidding for a given ad campaign happens repeatedly over its life span
until the budget runs out. As such, the bids are strategically correlated through
the constrained budget and the overall effectiveness of the campaign (e.g., the
rewards from generated clicks), which is only observed after the campaign has
completed. Thus, it is of great interest to devise an optimal bidding strategy
sequentially so that the campaign budget can be dynamically allocated across
all the available impressions on the basis of both the immediate and future
rewards. In this paper, we formulate the bid decision process as a
reinforcement learning problem, where the state space is represented by the
auction information and the campaign's real-time parameters, while an action is
the bid price to set. By modeling the state transition via auction competition,
we build a Markov Decision Process framework for learning the optimal bidding
policy to optimize the advertising performance in the dynamic real-time bidding
environment. Furthermore, the scalability problem from the large real-world
auction volume and campaign budget is well handled by state value approximation
using neural networks.
Comment: WSDM 201
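As a rough illustration of the formulation described in the abstract above, the following Python sketch solves a toy version of the bidding MDP by exact dynamic programming. The state is (auctions left, remaining budget), the action is an integer bid price, and the transition is driven by auction competition. Everything concrete here is an assumption made for illustration and is not taken from the paper: the second-price payment rule, the market-price distribution `MARKET_PRICE_PMF`, the constant `CLICK_VALUE`, and all function names. The neural-network value approximation mentioned in the abstract is omitted; this tabular version only works for tiny budgets and horizons.

```python
from functools import lru_cache

# Hypothetical toy market: probability that the highest competing bid
# (the price paid on a win, assuming a second-price auction) takes each value.
MARKET_PRICE_PMF = {1: 0.5, 2: 0.3, 3: 0.2}

# Assumed constant expected reward (e.g. click probability) per won impression.
CLICK_VALUE = 1.0


def action_value(auctions_left: int, budget: int, bid: int) -> float:
    """Expected total reward of placing `bid` now and acting optimally afterwards."""
    total = 0.0
    for price, prob in MARKET_PRICE_PMF.items():
        if price <= bid:
            # Win: collect the reward, pay the market price, move to the next auction.
            total += prob * (CLICK_VALUE + value(auctions_left - 1, budget - price))
        else:
            # Lose: budget is unchanged, move to the next auction.
            total += prob * value(auctions_left - 1, budget)
    return total


@lru_cache(maxsize=None)
def value(auctions_left: int, budget: int) -> float:
    """Optimal expected total reward from state (auctions_left, budget)."""
    if auctions_left == 0:
        return 0.0
    # Bids are capped by the remaining budget, so the budget never goes negative.
    return max(action_value(auctions_left, budget, bid) for bid in range(budget + 1))


def best_bid(auctions_left: int, budget: int) -> int:
    """Bid that maximizes the expected total reward in the given state."""
    return max(range(budget + 1),
               key=lambda bid: action_value(auctions_left, budget, bid))


if __name__ == "__main__":
    print(best_bid(auctions_left=2, budget=2), value(2, 2))
```

The point of the sketch is the coupling between bids: a win now reduces the budget available for later auctions, so the best bid depends on both the remaining budget and the remaining auction volume rather than on the current impression alone.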