ROI-Constrained Bidding via Curriculum-Guided Bayesian Reinforcement Learning
Real-Time Bidding (RTB) is an important mechanism in modern online
advertising systems. Advertisers employ bidding strategies in RTB to optimize
their advertising effects subject to various financial requirements, especially
the return-on-investment (ROI) constraint. ROIs change non-monotonically during
the sequential bidding process, and often induce a see-saw effect between
constraint satisfaction and objective optimization. While some existing
approaches show promising results in static or mildly changing ad markets, they
fail to generalize to highly dynamic ad markets with ROI constraints, due to
their inability to adaptively balance constraints and objectives amidst
non-stationarity and partial observability. In this work, we focus on
ROI-Constrained Bidding in non-stationary markets. Based on a Partially
Observable Constrained Markov Decision Process, our method exploits an
indicator-augmented reward function free of extra trade-off parameters and
develops a Curriculum-Guided Bayesian Reinforcement Learning (CBRL) framework
to adaptively control the constraint-objective trade-off in non-stationary ad
markets. Extensive experiments on a large-scale industrial dataset with two
problem settings reveal that CBRL generalizes well in both in-distribution and
out-of-distribution data regimes, and enjoys superior learning efficiency and
stability.
Comment: Accepted by SIGKDD 202
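
The abstract's remark that ROIs change non-monotonically during sequential bidding can be illustrated with a minimal sketch. It assumes ROI is the ratio of cumulative value to cumulative cost, which the abstract does not state explicitly. The function names, the toy numbers, and the `indicator_reward` form are all hypothetical; in particular, `indicator_reward` only shows one way an indicator could gate the objective without an extra trade-off weight, and is not the reward function defined in the paper.

```python
from itertools import accumulate


def cumulative_roi(values, costs):
    """ROI after each won impression, assumed here to be
    cumulative value divided by cumulative cost."""
    cum_values = accumulate(values)
    cum_costs = accumulate(costs)
    return [v / c if c > 0 else float("inf")
            for v, c in zip(cum_values, cum_costs)]


def indicator_reward(total_value, total_cost, roi_floor):
    """Hypothetical indicator-gated reward: the objective counts only
    when the ROI constraint holds, so no trade-off weight is needed."""
    feasible = total_cost == 0 or total_value / total_cost >= roi_floor
    return total_value if feasible else 0.0


# Toy episode (made-up numbers): value and cost of four won impressions.
values = [4.0, 1.0, 6.0, 1.0]
costs = [2.0, 2.0, 2.0, 2.0]

rois = cumulative_roi(values, costs)
for step, roi in enumerate(rois, start=1):
    print(f"step {step}: ROI = {roi:.3f}")

# End-of-episode reward under two hypothetical ROI floors.
print(indicator_reward(sum(values), sum(costs), roi_floor=1.4))
print(indicator_reward(sum(values), sum(costs), roi_floor=1.6))
```

In this toy episode the running ROI falls, rises, then falls again, so a constraint that holds at one step may be violated at the next. This is the see-saw between constraint satisfaction and objective optimization that the abstract describes.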