2,263 research outputs found

    Incentive-aware Contextual Pricing with Non-parametric Market Noise

    We consider a dynamic pricing problem for repeated contextual second-price auctions with strategic buyers whose goal is to maximize their long-term time-discounted utility. The seller has very limited information about buyers' overall demand curves, which depend on $d$-dimensional context vectors characterizing auctioned items and on a non-parametric market noise distribution that captures buyers' idiosyncratic tastes. The noise distribution and the relationship between the context vectors and buyers' demand curves are both unknown to the seller. We focus on designing the seller's learning policy to set contextual reserve prices, where the seller's goal is to minimize his regret for revenue. We first propose a pricing policy when buyers are truthful and show that it achieves a $T$-period regret bound of $\tilde{\mathcal{O}}(\sqrt{dT})$ against a clairvoyant policy that has full information of the buyers' demand. Next, under the setting where buyers bid strategically to maximize their long-term discounted utility, we develop a variant of our first policy that is robust to strategic (corrupted) bids. This policy incorporates randomized "isolation" periods, during which a buyer is randomly chosen to solely participate in the auction. We show that this design allows the seller to control the number of periods in which buyers significantly corrupt their bids. Because of this property, our robust policy enjoys a $T$-period regret of $\tilde{\mathcal{O}}(\sqrt{dT})$, matching that under the truthful setting up to a constant factor that depends on the utility discount factor.
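As a minimal illustration of the auction format this abstract refers to, the sketch below computes the seller's revenue in a single second-price auction with a reserve price. It is not the paper's pricing policy: the function name `second_price_revenue` and the convention that a lone qualifying bidder pays the reserve are assumptions made for the example.

```python
from typing import Sequence


def second_price_revenue(bids: Sequence[float], reserve: float) -> float:
    """Revenue of one second-price auction with a reserve price (illustrative).

    The highest bidder wins only if her bid meets the reserve, and she pays
    the larger of the reserve and the second-highest bid.
    """
    if not bids:
        return 0.0
    ranked = sorted(bids, reverse=True)
    if ranked[0] < reserve:
        # No bid clears the reserve, so the item goes unsold.
        return 0.0
    runner_up = ranked[1] if len(ranked) > 1 else 0.0
    return max(runner_up, reserve)
```

With a single participant, as in an "isolation" period, the runner-up bid is absent and the payment equals the reserve, so the reserve price alone determines revenue in those rounds.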

    Selling to a No-Regret Buyer

    We consider the problem of a single seller repeatedly selling a single item to a single buyer (specifically, the buyer has a value drawn fresh from a known distribution $D$ in every round). Prior work assumes that the buyer is fully rational and will perfectly reason about how their bids today affect the seller's decisions tomorrow. In this work we initiate a different direction: the buyer simply runs a no-regret learning algorithm over possible bids. We provide a fairly complete characterization of optimal auctions for the seller in this domain. Specifically:
    - If the buyer bids according to EXP3 (or any "mean-based" learning algorithm), then the seller can extract expected revenue arbitrarily close to the expected welfare. This auction is independent of the buyer's valuation distribution $D$, but somewhat unnatural, as it is sometimes in the buyer's interest to overbid.
    - There exists a learning algorithm $\mathcal{A}$ such that if the buyer bids according to $\mathcal{A}$, then the optimal strategy for the seller is simply to post the Myerson reserve for $D$ every round.
    - If the buyer bids according to EXP3 (or any "mean-based" learning algorithm), but the seller is restricted to "natural" auction formats where overbidding is dominated (e.g. Generalized First-Price or Generalized Second-Price), then the optimal strategy for the seller is a pay-your-bid format with decreasing reserves over time. Moreover, the seller's optimal achievable revenue is characterized by a linear program, and can be unboundedly better than the best truthful auction yet simultaneously unboundedly worse than the expected welfare.
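For readers unfamiliar with EXP3, the following is a minimal sketch of the standard algorithm, where each arm would stand for one candidate bid on a discretized grid. The class name `Exp3`, the exploration rate `gamma`, and the requirement that rewards lie in [0, 1] are assumptions of this sketch, not details taken from the paper.

```python
import math
import random
from typing import List


class Exp3:
    """Standard EXP3 bandit learner over a finite set of arms (illustrative)."""

    def __init__(self, n_arms: int, gamma: float = 0.1) -> None:
        self.n_arms = n_arms
        self.gamma = gamma  # exploration rate, assumed to lie in (0, 1]
        self.weights = [1.0] * n_arms

    def probabilities(self) -> List[float]:
        # Mix the normalized weights with uniform exploration.
        total = sum(self.weights)
        return [
            (1 - self.gamma) * w / total + self.gamma / self.n_arms
            for w in self.weights
        ]

    def draw(self, rng: random.Random) -> int:
        # Sample one arm index according to the current probabilities.
        return rng.choices(range(self.n_arms), weights=self.probabilities())[0]

    def update(self, arm: int, reward: float) -> None:
        # Importance-weighted estimate of the reward of the pulled arm only.
        estimate = reward / self.probabilities()[arm]
        self.weights[arm] *= math.exp(self.gamma * estimate / self.n_arms)
        # Rescale so the weights stay bounded over long runs.
        top = max(self.weights)
        self.weights = [w / top for w in self.weights]
```

In a bidding simulation, the buyer would call `draw` to pick a bid each round and `update` with her realized utility rescaled to [0, 1]; how that utility is defined depends on the auction format the seller chooses.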

    Contextual Standard Auctions with Budgets: Revenue Equivalence and Efficiency Guarantees

    The internet advertising market is a multi-billion dollar industry, in which advertisers buy thousands of ad placements every day by repeatedly participating in auctions. In recent years, the industry has shifted to first-price auctions as the preferred paradigm for selling advertising slots. Another important and ubiquitous feature of these auctions is the presence of campaign budgets, which specify the maximum amount the advertisers are willing to pay over a specified time period. In this paper, we present a new model to study the equilibrium bidding strategies in standard auctions, a large class of auctions that includes first- and second-price auctions, for advertisers who satisfy budget constraints on average. Our model dispenses with the common yet unrealistic assumption that advertisers' values are independent and instead assumes a contextual model in which advertisers determine their values using a common feature vector. We show the existence of a natural value-pacing-based Bayes-Nash equilibrium under very mild assumptions. Furthermore, we prove a revenue equivalence result showing that all standard auctions yield the same revenue even in the presence of budget constraints. Leveraging this equivalence, we prove Price of Anarchy bounds for liquid welfare and structural properties of pacing-based equilibria that hold for all standard auctions. Our work takes an important step toward understanding the implications of the shift to first-price auctions in internet advertising markets.

    Adversarial learning for revenue-maximizing auctions

    We introduce a new numerical framework to learn optimal bidding strategies in repeated auctions when the seller uses past bids to optimize her mechanism. Crucially, we do not assume that the bidders know what optimization mechanism is used by the seller. We recover essentially all state-of-the-art analytical results for the single-item framework, previously derived in the setup where the bidder knows the optimization mechanism used by the seller, and extend our approach to multi-item settings, in which no optimal shading strategies were previously known. Our approach yields substantial increases in bidder utility in all settings. It also has strong potential for practical use, since it provides a simple way to optimize bidding strategies on modern marketplaces where buyers face unknown data-driven mechanisms.

    Low-Regret Algorithms for Strategic Buyers with Unknown Valuations in Repeated Posted-Price Auctions

    We study repeated posted-price auctions where a single seller repeatedly interacts with a single buyer for a number of rounds. Previous works commonly assume that the buyer knows his own valuation with certainty. However, in many practical situations, the buyer may have a stochastic valuation. In this paper, we study repeated posted-price auctions from the perspective of a utility-maximizing buyer who does not know the probability distribution of his valuation and only observes a sample from the valuation distribution after he purchases the item. We first consider non-strategic buyers and derive algorithms with sub-linear regret bounds that hold irrespective of the observed prices offered by the seller. These algorithms are then adapted into algorithms with similar guarantees for strategic buyers. We provide a theoretical analysis of our proposed algorithms and support our findings with numerical experiments. Our experiments show that, if the seller uses a low-regret algorithm for selecting the price, then strategic buyers can obtain much higher utilities than non-strategic buyers. Being strategic is not beneficial only when the seller's prices are unrelated to the buyer's choices, and even then strategic buyers can still attain about 75% of the utility of non-strategic buyers.

    Optimal No-regret Learning in Repeated First-price Auctions

    We study online learning in repeated first-price auctions with censored feedback, where a bidder, only observing the winning bid at the end of each auction, learns to adaptively bid in order to maximize her cumulative payoff. To achieve this goal, the bidder faces a challenging dilemma: if she wins the bid (the only way to achieve positive payoffs), then she is not able to observe the highest bid of the other bidders, which we assume is drawn i.i.d. from an unknown distribution. This dilemma, despite being reminiscent of the exploration-exploitation trade-off in contextual bandits, cannot directly be addressed by the existing UCB or Thompson sampling algorithms in that literature, mainly because, contrary to the standard bandits setting, when a positive reward is obtained here, nothing about the environment can be learned. In this paper, by exploiting the structural properties of first-price auctions, we develop the first learning algorithm that achieves an $O(\sqrt{T}\log^2 T)$ regret bound when the bidder's private values are stochastically generated. We do so by providing an algorithm on a general class of problems, which we call monotone group contextual bandits, where the same regret bound is established under stochastically generated contexts. Further, by a novel lower bound argument, we characterize an $\Omega(T^{2/3})$ lower bound for the case where the contexts are adversarially generated, thus highlighting the impact of the context generation mechanism on the fundamental learning limit. Despite this, we further exploit the structure of first-price auctions and develop a learning algorithm that operates sample-efficiently (and computationally efficiently) in the presence of adversarially generated private values. We establish an $O(\sqrt{T}\log^3 T)$ regret bound for this algorithm, hence providing a complete characterization of optimal learning guarantees for this problem.
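The censored-feedback structure described in this abstract can be made concrete with a short sketch of a single round. This is only an illustration of the feedback model, not the paper's learning algorithm; the names `Feedback` and `first_price_round`, and the rule that ties go to the learner, are assumptions of the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Feedback:
    won: bool
    payoff: float
    winning_bid: float
    # Highest competing bid, or None when it is censored by the learner's win.
    competing_bid: Optional[float]


def first_price_round(value: float, bid: float, highest_other_bid: float) -> Feedback:
    """One round of a first-price auction where only the winning bid is revealed."""
    if bid >= highest_other_bid:  # assumption: ties are broken in the learner's favor
        # The winning bid is the learner's own, so the competing bid stays hidden.
        return Feedback(True, value - bid, bid, None)
    # On a loss the revealed winning bid is exactly the highest competing bid.
    return Feedback(False, 0.0, highest_other_bid, highest_other_bid)
```

The asymmetry is visible in the return values: a positive payoff always comes with `competing_bid=None`, while an informative observation always comes with zero payoff.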