3 research outputs found
Incentive-aware Contextual Pricing with Non-parametric Market Noise
We consider a dynamic pricing problem for repeated contextual second-price
auctions with strategic buyers whose goals are to maximize their long-term time
discounted utility. The seller has very limited information about buyers'
overall demand curves, which depend on $d$-dimensional context vectors
characterizing auctioned items, and a non-parametric market noise distribution
that captures buyers' idiosyncratic tastes. The noise distribution and the
relationship between the context vectors and buyers' demand curves are both
unknown to the seller. We focus on designing the seller's learning policy to
set contextual reserve prices where the seller's goal is to minimize his regret
for revenue. We first propose a pricing policy when buyers are truthful and
show that it achieves a $T$-period regret bound of
$\tilde{\mathcal{O}}(\sqrt{dT})$ against a clairvoyant policy that has full
information of the buyers' demand. Next, under the setting where buyers bid
strategically to maximize their long-term discounted utility, we develop a
variant of our first policy that is robust to strategic (corrupted) bids. This
policy incorporates randomized "isolation" periods, during which a buyer is
randomly chosen to solely participate in the auction. We show that this design
allows the seller to control the number of periods in which buyers
significantly corrupt their bids. Because of this nice property, our robust
policy enjoys a $T$-period regret of $\tilde{\mathcal{O}}(\sqrt{dT})$, matching
that under the truthful setting up to a constant factor that depends on the
utility discount factor.
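As an illustration of the auction format described in this abstract, the following is a minimal sketch of one round of a second-price auction with a reserve price. The function name `second_price_revenue` and the tie and single-bidder conventions are assumptions made for the example; they are not taken from the paper, and the sketch does not implement the paper's learning policy.

```python
def second_price_revenue(bids, reserve):
    """Seller revenue for one second-price auction with a reserve price.

    Assumed convention: the item sells only if the highest bid meets the
    reserve, and the winner pays the larger of the reserve and the
    second-highest bid.
    """
    if not bids:
        return 0.0
    ordered = sorted(bids, reverse=True)
    highest = ordered[0]
    if highest < reserve:
        return 0.0  # no bid clears the reserve, so the item is unsold
    # With a single bidder there is no second bid, so the reserve is paid.
    second = ordered[1] if len(ordered) > 1 else reserve
    return float(max(second, reserve))
```

With bids of 10 and 6, a reserve of 8 raises the payment from 6 to 8, while a reserve of 12 leaves the item unsold. This trade-off is what makes the choice of reserve price a learning problem for the seller.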
Online Learning in Multi-unit Auctions
We consider repeated multi-unit auctions with uniform pricing, which are
widely used in practice for allocating goods such as carbon licenses. In each
round, $K$ identical units of a good are sold to a group of buyers that have
valuations with diminishing marginal returns. The buyers submit bids for the
units, and then a price is set per unit so that all the units are sold. We
consider two variants of the auction, where the price is set to the $K$-th
highest bid and the $(K+1)$-st highest bid, respectively.
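The two pricing rules can be sketched as follows. This is a minimal illustration, assuming all per-unit bids from all buyers are pooled into one list; the function name `uniform_price`, the `rule` labels, and the fallback price of zero when there are too few bids are assumptions for the example, not details from the paper.

```python
def uniform_price(bids, k, rule="kth"):
    """Per-unit clearing price when k identical units are sold.

    bids: all per-unit bids submitted by all buyers, pooled together.
    rule: "kth" charges the k-th highest bid (the lowest winning bid);
          "k_plus_1" charges the (k+1)-st highest bid (the highest losing bid).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    ordered = sorted(bids, reverse=True)
    if rule == "kth":
        index = k - 1
    elif rule == "k_plus_1":
        index = k
    else:
        raise ValueError(f"unknown rule: {rule}")
    # Assumption: if there are not enough bids, the price falls to zero.
    return float(ordered[index]) if index < len(ordered) else 0.0
```

For pooled bids 9, 7, 6, 4, 2 and three units, the first rule sets the price at 6 and the second at 4; in both cases the three highest bids win.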
We analyze the properties of this auction in both the offline and online
settings. In the offline setting, we consider the problem that one player $i$
is facing: given access to a data set that contains the bids submitted by
competitors in past auctions, find a bid vector that maximizes player $i$'s
cumulative utility on the data set. We design a polynomial time algorithm for
this problem, by showing it is equivalent to finding a maximum-weight path on a
carefully constructed directed acyclic graph.
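The abstract does not describe how the graph is constructed, so the following is only a generic sketch of the underlying primitive: finding a maximum-weight path in a directed acyclic graph by dynamic programming over a topological order. The edge-dictionary input format and the function name `max_weight_path` are assumptions for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def max_weight_path(edges, source, sink):
    """Return (weight, path) of a maximum-weight source-to-sink path.

    edges: dict mapping (u, v) -> weight of the directed edge u -> v.
    Returns None if the sink is not reachable from the source.
    """
    # Build predecessor lists: preds[v] holds (u, weight) for each edge u -> v.
    preds = {}
    for (u, v), w in edges.items():
        preds.setdefault(u, [])
        preds.setdefault(v, []).append((u, w))

    # TopologicalSorter expects node -> predecessors and yields predecessors first.
    order = TopologicalSorter(
        {v: [u for u, _ in ps] for v, ps in preds.items()}
    ).static_order()

    # best[v] = (best weight of a source -> v path, previous node on that path)
    best = {source: (0.0, None)}
    for v in order:
        for u, w in preds.get(v, []):
            if u in best and (v not in best or best[u][0] + w > best[v][0]):
                best[v] = (best[u][0] + w, u)

    if sink not in best:
        return None

    # Walk the stored predecessors back from the sink to recover the path.
    path = []
    node = sink
    while node is not None:
        path.append(node)
        node = best[node][1]
    return best[sink][0], path[::-1]
```

Each edge is examined once, so the running time is linear in the size of the graph. Whether the overall procedure is polynomial then depends on the size of the constructed graph.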
In the online setting, the players run learning algorithms to update their
bids as they participate in the auction over time. Based on our offline
algorithm, we design efficient online learning algorithms for bidding. The
algorithms have sublinear regret, under both full information and bandit
feedback structures. We complement our online learning algorithms with regret
lower bounds.
Finally, we analyze the quality of the equilibria in the worst case through
the lens of the core solution concept in the game among the bidders. We show
that the $(K+1)$-st price format is susceptible to collusion among the bidders;
meanwhile, the $K$-th price format does not have this issue.