1 research outputs found
Learning in Repeated Multi-Unit Pay-As-Bid Auctions
Motivated by Carbon Emissions Trading Schemes, Treasury Auctions, and
Procurement Auctions, which all involve the auctioning of homogeneous multiple
units, we consider the problem of learning how to bid in repeated multi-unit
pay-as-bid auctions. In each of these auctions, a large number of (identical)
items are to be allocated to the largest submitted bids, where the price of
each of the winning bids is equal to the bid itself. The problem of learning
how to bid in pay-as-bid auctions is challenging due to the combinatorial
nature of the action space. We overcome this challenge by focusing on the
offline setting, where the bidder optimizes their vector of bids while only
having access to the past submitted bids by other bidders. We show that the
optimal solution to the offline problem can be obtained using a polynomial time
dynamic programming (DP) scheme. We leverage the structure of the DP scheme to
design online learning algorithms with polynomial time and space complexity
under full information and bandit feedback settings. We achieve an upper bound
on regret of and respectively, where is the number of units demanded by the
bidder, is the total number of auctions, and is the size of
the discretized bid space. We accompany these results with a regret lower
bound, which match the linear dependency in . Our numerical results suggest
that when all agents behave according to our proposed no regret learning
algorithms, the resulting market dynamics mainly converge to a welfare
maximizing equilibrium where bidders submit uniform bids. Lastly, our
experiments demonstrate that the pay-as-bid auction consistently generates
significantly higher revenue compared to its popular alternative, the uniform
price auction.Comment: 51 pages, 12 Figure