Incentive-aware Contextual Pricing with Non-parametric Market Noise
We consider a dynamic pricing problem for repeated contextual second-price
auctions with strategic buyers whose goals are to maximize their long-term time
discounted utility. The seller has very limited information about buyers'
overall demand curves, which depend on -dimensional context vectors
characterizing auctioned items, and a non-parametric market noise distribution
that captures buyers' idiosyncratic tastes. The noise distribution and the
relationship between the context vectors and buyers' demand curves are both
unknown to the seller. We focus on designing the seller's learning policy to
set contextual reserve prices where the seller's goal is to minimize his regret
for revenue. We first propose a pricing policy when buyers are truthful and
show that it achieves a -period regret bound of
against a clairvoyant policy that has full
information of the buyers' demand. Next, under the setting where buyers bid
strategically to maximize their long-term discounted utility, we develop a
variant of our first policy that is robust to strategic (corrupted) bids. This
policy incorporates randomized "isolation" periods, during which a buyer is
randomly chosen to solely participate in the auction. We show that this design
allows the seller to control the number of periods in which buyers
significantly corrupt their bids. Because of this nice property, our robust
policy enjoys a -period regret of , matching
that under the truthful setting up to a constant factor that depends on the
utility discount factor.
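The auction format named in this abstract can be sketched for a single round as follows. This is only a minimal illustration of a second-price auction with a reserve price, not the paper's learning policy; the function name, the tie-breaking rule, and the example numbers are assumptions made here.

```python
from typing import Optional, Sequence, Tuple


def second_price_with_reserve(
    bids: Sequence[float], reserve: float
) -> Tuple[Optional[int], float]:
    """Return (winner index, payment) for one second-price auction round.

    The highest bidder wins only if the bid meets the reserve, and then pays
    the larger of the reserve and the second-highest bid. Otherwise nobody
    wins and the payment is 0.
    """
    if not bids:
        return None, 0.0
    # Index of the highest bid (assumption: ties go to the lowest index).
    winner = max(range(len(bids)), key=lambda i: bids[i])
    if bids[winner] < reserve:
        return None, 0.0
    # Second-highest bid, or 0 when there is only one bidder.
    others = [b for i, b in enumerate(bids) if i != winner]
    second = max(others) if others else 0.0
    return winner, max(reserve, second)
```

With a single participating bidder, as in the "isolation" periods the abstract describes, the payment in this sketch reduces to the reserve price whenever the bid clears it.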
Selling to a No-Regret Buyer
We consider the problem of a single seller repeatedly selling a single item
to a single buyer (specifically, the buyer has a value drawn fresh from a known
distribution in every round). Prior work assumes that the buyer is fully
rational and will perfectly reason about how their bids today affect the
seller's decisions tomorrow. In this work we initiate a different direction:
the buyer simply runs a no-regret learning algorithm over possible bids. We
provide a fairly complete characterization of optimal auctions for the seller
in this domain. Specifically:
- If the buyer bids according to EXP3 (or any "mean-based" learning
algorithm), then the seller can extract expected revenue arbitrarily close to
the expected welfare. This auction is independent of the buyer's valuation
distribution, but somewhat unnatural as it is sometimes in the buyer's interest
to overbid.
- There exists a learning algorithm such that, if the buyer bids according to
it, then the optimal strategy for the seller is simply to post the Myerson
reserve for every round.
- If the buyer bids according to EXP3 (or any "mean-based" learning algorithm),
but the seller is restricted to "natural" auction formats where overbidding is
dominated (e.g. Generalized First-Price or Generalized Second-Price), then the
optimal strategy for the seller is a pay-your-bid format with decreasing
reserves over time. Moreover, the seller's optimal achievable revenue is
characterized by a linear program, and can be unboundedly better than the best
truthful auction yet simultaneously unboundedly worse than the expected welfare.
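To make the buyer's side of this abstract concrete, here is a minimal sketch of a textbook EXP3 learner choosing among a finite grid of bids. It is not the paper's construction: the fixed reserve, the pay-your-bid utility, the rescaling of utilities to [0, 1], and all names (`Exp3`, `scaled_utility`, `simulate`) are assumptions made for illustration, and values and bids are assumed to lie in [0, 1].

```python
import math
import random
from typing import List, Sequence


class Exp3:
    """Textbook EXP3 over a finite set of arms, with rewards in [0, 1]."""

    def __init__(self, n_arms: int, gamma: float, rng: random.Random) -> None:
        self.n_arms = n_arms
        self.gamma = gamma  # exploration rate in (0, 1]
        self.rng = rng
        self.weights = [1.0] * n_arms

    def probabilities(self) -> List[float]:
        # Mix the weight-proportional distribution with uniform exploration.
        total = sum(self.weights)
        uniform = self.gamma / self.n_arms
        return [(1.0 - self.gamma) * w / total + uniform for w in self.weights]

    def draw(self) -> int:
        return self.rng.choices(range(self.n_arms), weights=self.probabilities())[0]

    def update(self, arm: int, reward: float) -> None:
        # Importance-weighted reward estimate for the arm that was played.
        estimate = reward / self.probabilities()[arm]
        self.weights[arm] *= math.exp(self.gamma * estimate / self.n_arms)
        # Rescale so the largest weight is 1; the probabilities are unchanged.
        top = max(self.weights)
        self.weights = [w / top for w in self.weights]


def scaled_utility(value: float, bid: float, reserve: float) -> float:
    """Pay-your-bid utility against a reserve, mapped from [-1, 1] to [0, 1]."""
    utility = value - bid if bid >= reserve else 0.0
    return (utility + 1.0) / 2.0


def simulate(bids: Sequence[float], value: float, reserve: float,
             rounds: int, seed: int) -> List[float]:
    """Run an EXP3 bidder for some rounds and return its final bid distribution."""
    learner = Exp3(len(bids), gamma=0.1, rng=random.Random(seed))
    for _ in range(rounds):
        arm = learner.draw()
        learner.update(arm, scaled_utility(value, bids[arm], reserve))
    return learner.probabilities()
```

A call such as `simulate([0.0, 0.25, 0.5, 0.75, 1.0], value=0.8, reserve=0.5, rounds=5000, seed=0)` returns the learner's final probabilities over the bid grid. The grid includes a bid above the value, so overbidding is possible in this sketch, as in the first bullet of the abstract.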
Contextual Standard Auctions with Budgets: Revenue Equivalence and Efficiency Guarantees
The internet advertising market is a multi-billion dollar industry, in which
advertisers buy thousands of ad placements every day by repeatedly
participating in auctions. In recent years, the industry has shifted to
first-price auctions as the preferred paradigm for selling advertising slots.
Another important and ubiquitous feature of these auctions is the presence of
campaign budgets, which specify the maximum amount the advertisers are willing
to pay over a specified time period. In this paper, we present a new model to
study the equilibrium bidding strategies in standard auctions, a large class of
auctions that includes first- and second-price auctions, for advertisers who
satisfy budget constraints on average. Our model dispenses with the common, yet
unrealistic assumption that advertisers' values are independent and instead
assumes a contextual model in which advertisers determine their values using a
common feature vector. We show the existence of a natural value-pacing-based
Bayes-Nash equilibrium under very mild assumptions. Furthermore, we prove a
revenue equivalence showing that all standard auctions yield the same revenue
even in the presence of budget constraints. Leveraging this equivalence, we
prove Price of Anarchy bounds for liquid welfare and structural properties of
pacing-based equilibria that hold for all standard auctions. Our work takes an
important step toward understanding the implications of the shift to
first-price auctions in internet advertising markets.
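As background for the revenue equivalence claim, the classical textbook version can be checked numerically. The sketch below is not the paper's contextual, budget-constrained model: it assumes independent Uniform[0, 1] values, no budgets, truthful bidding in the second-price auction, and the standard symmetric first-price equilibrium bid of (n - 1)/n times the value. The function name and parameters are hypothetical.

```python
import random
from typing import Tuple


def simulate_revenues(n_bidders: int, rounds: int, seed: int) -> Tuple[float, float]:
    """Average first-price and second-price revenue over simulated rounds.

    Assumes i.i.d. Uniform[0, 1] values, no budgets, and n_bidders >= 2.
    """
    rng = random.Random(seed)
    # Symmetric first-price equilibrium bid is shade * value in this setting.
    shade = (n_bidders - 1) / n_bidders
    first_total = 0.0
    second_total = 0.0
    for _ in range(rounds):
        values = sorted(rng.random() for _ in range(n_bidders))
        first_total += shade * values[-1]  # winner pays their own shaded bid
        second_total += values[-2]  # winner pays the second-highest value
    return first_total / rounds, second_total / rounds
```

In this classical setting both averages should approach (n - 1)/(n + 1) as the number of rounds grows, for example 0.5 with three bidders.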
Adversarial learning for revenue-maximizing auctions
We introduce a new numerical framework to learn optimal bidding strategies in
repeated auctions when the seller uses past bids to optimize her mechanism.
Crucially, we do not assume that the bidders know what optimization mechanism
is used by the seller. We recover essentially all state-of-the-art analytical
results for the single-item framework derived previously in the setup where the
bidder knows the optimization mechanism used by the seller and extend our
approach to multi-item settings, in which no optimal shading strategies were
previously known. Our approach yields substantial increases in bidder utility
in all settings. Our approach also has a strong potential for practical usage
since it provides a simple way to optimize bidding strategies on modern
marketplaces where buyers face unknown data-driven mechanisms.
Low-Regret Algorithms for Strategic Buyers with Unknown Valuations in Repeated Posted-Price Auctions
We study repeated posted-price auctions where a single seller repeatedly interacts with a single buyer for a number of rounds. Previous work commonly assumes that the buyer knows his own valuation with certainty. However, in many practical situations, the buyer may have a stochastic valuation. In this paper, we study repeated posted-price auctions from the perspective of a utility-maximizing buyer who does not know the probability distribution of his valuation and only observes a sample from the valuation distribution after he purchases the item. We first consider non-strategic buyers and derive algorithms with sub-linear regret bounds that hold irrespective of the observed prices offered by the seller. These algorithms are then adapted into algorithms with similar guarantees for strategic buyers. We provide a theoretical analysis of our proposed algorithms and support our findings with numerical experiments. Our experiments show that, if the seller uses a low-regret algorithm for selecting the price, then strategic buyers can obtain much higher utilities than non-strategic buyers. Being strategic is not beneficial only when the seller's prices are unrelated to the buyer's choices, and even then strategic buyers can still attain about 75% of the utility of non-strategic buyers.
Optimal No-regret Learning in Repeated First-price Auctions
We study online learning in repeated first-price auctions with censored
feedback, where a bidder, only observing the winning bid at the end of each
auction, learns to adaptively bid in order to maximize her cumulative payoff.
To achieve this goal, the bidder faces a challenging dilemma: if she wins the
bid--the only way to achieve positive payoffs--then she is not able to observe
the highest bid of the other bidders, which we assume is iid drawn from an
unknown distribution. This dilemma, despite being reminiscent of the
exploration-exploitation trade-off in contextual bandits, cannot directly be
addressed by the existing UCB or Thompson sampling algorithms in that
literature, mainly because contrary to the standard bandits setting, when a
positive reward is obtained here, nothing about the environment can be learned.
In this paper, by exploiting the structural properties of first-price
auctions, we develop the first learning algorithm that achieves
regret bound when the bidder's private values are
stochastically generated. We do so by providing an algorithm on a general class
of problems, which we call monotone group contextual bandits, where the same
regret bound is established under stochastically generated contexts. Further,
by a novel lower bound argument, we characterize an lower
bound for the case where the contexts are adversarially generated, thus
highlighting the impact of the context generation mechanism on the fundamental
learning limit. Despite this, we further exploit the structure of first-price
auctions and develop a learning algorithm that operates sample-efficiently (and
computationally efficiently) in the presence of adversarially generated private
values. We establish an regret bound for this algorithm,
hence providing a complete characterization of optimal learning guarantees for
this problem.
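The censored-feedback dilemma described in this abstract can be illustrated with a one-round sketch. This is not the paper's algorithm; the `Feedback` type, the function name, and the rule that ties go to the learning bidder are assumptions made here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feedback:
    won: bool
    payoff: float
    winning_bid: float  # the only quantity revealed to the bidder


def first_price_round(value: float, bid: float, highest_other_bid: float) -> Feedback:
    """One first-price auction round as seen by the learning bidder.

    Assumption: ties are broken in favor of the learning bidder.
    """
    if bid >= highest_other_bid:
        # The bidder wins and pays her own bid; the revealed winning bid is her
        # own, so the competing bid stays hidden.
        return Feedback(won=True, payoff=value - bid, winning_bid=bid)
    # The bidder loses, earns nothing, and observes the competing bid.
    return Feedback(won=False, payoff=0.0, winning_bid=highest_other_bid)
```

In this sketch, `first_price_round(0.9, 0.6, 0.4)` and `first_price_round(0.9, 0.6, 0.1)` return identical feedback, which is the censoring at work: a winning round reveals nothing about the competing bid beyond the fact that it did not exceed the bidder's own.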