
49 research outputs found

Learning to Bid in Repeated First-Price Auctions with Budgets

Author: Deng Xiaotie
Kong Yuqing
Wang Qian
Yang Zongjun
Publication date: 26/04/2023

Budget management strategies in repeated auctions have received growing attention in online advertising markets. However, previous work on budget management in online bidding mainly focused on second-price auctions. The rapid shift from second-price auctions to first-price auctions for online ads in recent years has motivated the challenging question of how to bid in repeated first-price auctions while controlling budgets. In this work, we study the problem of learning in repeated first-price auctions with budgets. We design a dual-based algorithm that can achieve a near-optimal $\widetilde{O}(\sqrt{T})$ regret with full information feedback, where the maximum competing bid is always revealed after each auction. We further consider the setting with one-sided information feedback, where only the winning bid is revealed after each auction. We show that our modified algorithm can still achieve an $\widetilde{O}(\sqrt{T})$ regret with mild assumptions on the bidder's value distribution. Finally, we complement the theoretical results with numerical experiments to confirm the effectiveness of our budget management policy.
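To make the idea of a dual-based bidding rule concrete, the following is a minimal sketch of a generic dual-descent bidder under full information feedback. It is not the algorithm from the paper, whose details are not given in the abstract. The function name `run_dual_bidder`, the discrete `bid_grid`, the fixed `step_size`, the empirical win-probability estimate, and the tie-breaking rule are all assumptions made for illustration.

```python
import bisect
import random


def run_dual_bidder(values, competing_bids, budget, step_size, bid_grid):
    """Generic dual-descent bidder for repeated first-price auctions.

    Returns (total_utility, total_spend). Illustrative only.
    """
    horizon = len(values)
    target_rate = budget / horizon   # per-round spending target
    dual = 0.0                       # multiplier on the budget constraint
    history = []                     # sorted competing bids revealed so far
    remaining = budget
    utility = 0.0

    for value, rival in zip(values, competing_bids):
        def score(bid):
            # Empirical win probability from past competing bids
            # (optimistically 1.0 before any feedback arrives).
            if history:
                win_prob = bisect.bisect_right(history, bid) / len(history)
            else:
                win_prob = 1.0
            return (value - (1.0 + dual) * bid) * win_prob

        # Never bid more than the remaining budget.
        affordable = [b for b in bid_grid if b <= remaining]
        bid = max(affordable, key=score) if affordable else 0.0

        # Assumed tie-breaking: a positive bid wins ties.
        won = bid > 0.0 and bid >= rival
        payment = bid if won else 0.0
        if won:
            utility += value - payment
        remaining -= payment

        # Projected gradient step: raise the multiplier after overspending.
        dual = max(0.0, dual - step_size * (target_rate - payment))
        # Full-information feedback: the competing bid is always revealed.
        bisect.insort(history, rival)

    return utility, budget - remaining


if __name__ == "__main__":
    random.seed(0)
    vals = [random.random() for _ in range(500)]
    rivals = [random.random() for _ in range(500)]
    grid = [i / 20 for i in range(21)]
    print(run_dual_bidder(vals, rivals, budget=50.0, step_size=0.05, bid_grid=grid))
```

The multiplier acts as a price on spending: when recent payments exceed the per-round target it grows and shades bids downward, and it decays back toward zero otherwise.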

arXiv.org e-Print Archive

On the complexity of computing Markov perfect equilibrium in general-sum stochastic games

Author: Deng Xiaotie
Li Ningyuan
Mguni David
Wang Jun
Yang Yaodong
Publication venue: OXFORD UNIV PRESS
Publication date: 22/11/2022

Similar to the role of Markov decision processes in reinforcement learning, Markov games (also called stochastic games) lay down the foundation for the study of multi-agent reinforcement learning and sequential agent interactions. We introduce approximate Markov perfect equilibrium as a solution to the computational problem of finite-state stochastic games repeated in the infinite horizon and prove its PPAD-completeness. This solution concept preserves the Markov perfect property and opens up the possibility for the success of multi-agent reinforcement learning algorithms on static two-player games to be extended to multi-agent dynamic games, expanding the reign of the PPAD-complete class.

UCL Discovery

PubMed Central

On the Re-Solving Heuristic for (Binary) Contextual Bandits with Knapsacks

Author: Ai Rui
Chen Zhaohua
Deng Xiaotie
Pan Yuqi
Wang Chang
Yang Mingwei
Publication date: 25/11/2022

In the problem of (binary) contextual bandits with knapsacks (CBwK), the agent receives an i.i.d. context in each of the $T$ rounds and chooses an action, resulting in a random reward and a random consumption of resources that are related to an i.i.d. external factor. The agent's goal is to maximize the accumulated reward under the initial resource constraints. In this work, we combine the re-solving heuristic, which proved successful in revenue management, with distribution estimation techniques to solve this problem. We consider two different information feedback models, with full and partial information, which vary in the difficulty of getting a sample of the external factor. Under both information feedback settings, we achieve two-way results: (1) For general problems, we show that our algorithm gets an $\widetilde O(T^{\alpha_u} + T^{\alpha_v} + T^{1/2})$ regret against the fluid benchmark. Here, $\alpha_u$ and $\alpha_v$ reflect the complexity of the context and external factor distributions, respectively. This result is comparable to existing results. (2) When the fluid problem is linear programming with a unique and non-degenerate optimal solution, our algorithm leads to an $\widetilde O(1)$ regret. To the best of our knowledge, this is the first $\widetilde O(1)$ regret result in the CBwK problem regardless of information feedback models. We further use numerical experiments to verify our results. Comment: 43 pages, 2 figures, 1 table

arXiv.org e-Print Archive

Coordinated Dynamic Bidding in Repeated Second-Price Auctions with Budgets

Author: Chen Yurong
Chen Zhaohua
Deng Xiaotie
Duan Zhijian
Sun Haoran
Wang Qian
Yan Xiang
Publication date: 13/06/2023

In online ad markets, a rising number of advertisers are employing bidding agencies to participate in ad auctions. These agencies are specialized in designing online algorithms and bidding on behalf of their clients. Typically, an agency has information on multiple advertisers, so she can potentially coordinate bids to help her clients achieve higher utilities than those under independent bidding. In this paper, we study coordinated online bidding algorithms in repeated second-price auctions with budgets. We propose algorithms that guarantee every client a higher utility than the best she can get under independent bidding. We show that these algorithms achieve maximal coalition welfare and discuss bidders' incentives to misreport their budgets, in symmetric cases. Our proofs combine the techniques of online learning and equilibrium analysis, overcoming the difficulty of competing with a multi-dimensional benchmark. The performance of our algorithms is further evaluated by experiments on both synthetic and real data. To the best of our knowledge, we are the first to consider bidder coordination in online repeated auctions with constraints. Comment: 43 pages, 12 figures

arXiv.org e-Print Archive

Is Nash Equilibrium Approximator Learnable?

Author: Deng Xiaotie
Du Yali
Duan Zhijian
Huang Wenhan
Wang Jun
Yang Yaodong
Zhang Dinghuai
Publication date: 22/01/2023

In this paper, we investigate the learnability of the function approximator that approximates Nash equilibrium (NE) for games generated from a distribution. First, we offer a generalization bound using the Probably Approximately Correct (PAC) learning model. The bound describes the gap between the expected loss and empirical loss of the NE approximator. Afterward, we prove the agnostic PAC learnability of the Nash approximator. In addition to theoretical analysis, we demonstrate an application of NE approximator in experiments. The trained NE approximator can be used to warm-start and accelerate classical NE solvers. Together, our results show the practicability of approximating NE through function approximation. Comment: Accepted by AAMAS 202

arXiv.org e-Print Archive

UCL Discovery

Formal Analysis and Systematic Construction of Two-factor Authentication Scheme

Author: Duncan S. Wong
Guomin Yang
Huaxiong Wang
Xiaotie Deng
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 15/08/2006

One of the most commonly used two-factor authentication mechanisms is based on smart card and user's password. Throughout the years, there have been many schemes proposed, but most of them have already been found flawed due to the lack of formal security analysis. On the cryptanalysis of this type of schemes, in this paper, we further review two recently proposed schemes and show that their security claims are invalid. To address the current issue, we propose a new and simplified property set and a formal adversarial model for analyzing the security of this type of schemes. We believe that the property set and the adversarial model themselves are of independent interest. We then propose a new scheme and a generic construction framework. In particular, we show that a secure password based key exchange protocol can be transformed efficiently to a smartcard and password based two-factor authentication scheme provided that there exist pseudorandom functions and collision-resistant hash functions.

Cryptology ePrint Archive

Dynamic Budget Throttling in Repeated Second-Price Auctions

Author: Cai Zheng
Chen Zhaohua
Deng Xiaotie
Pan Yuqi
Ren Yukun
Shi Zhuming
Tang Chuyue
Wang Chang
Wang Qian
Zhu Zhihua
Publication date: 14/07/2022

Throttling is one of the most popular budget control methods in today's online advertising markets. When a budget-constrained advertiser employs throttling, she can choose whether or not to participate in an auction after the advertising platform recommends a bid. This paper focuses on the dynamic budget throttling process in repeated second-price auctions from a theoretical view. An essential feature of the underlying problem is that the advertiser does not know the distribution of the highest competing bid upon entering the market. To model the difficulty of eliminating such uncertainty, we consider two different information structures. The advertiser could obtain the highest competing bid in each round with full-information feedback. Meanwhile, with partial information feedback, the advertiser could only have access to the highest competing bid in the auctions she participates in. We propose the OGD-CB algorithm, which involves simultaneous distribution learning and revenue optimization. In both settings, we demonstrate that this algorithm guarantees an $O(\sqrt{T\log T})$ regret with probability $1 - O(1/T)$ relative to the fluid adaptive throttling benchmark. By proving a lower bound of $\Omega(\sqrt{T})$ on the minimal regret for even the hindsight optimum, we establish the near optimality of our algorithm. Finally, we compare the fluid optimum of throttling to that of pacing, another widely adopted budget control method. The numerical relationship of these benchmarks sheds new light on the understanding of different online algorithms for revenue maximization under budget constraints. Comment: 29 pages, 1 table
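The participate-or-skip decision under partial information feedback can be illustrated with a minimal sketch. This is a generic dual-threshold throttling rule and not the OGD-CB algorithm, whose construction the abstract does not spell out. The function name `run_throttling`, the assumption that the recommended bid equals the advertiser's value, the optimistic first-round estimate, the fixed `step_size`, and the tie-breaking rule are all hypothetical choices for illustration.

```python
import random


def run_throttling(values, competing_bids, budget, step_size):
    """Generic dual-threshold throttling in repeated second-price auctions.

    Returns (total_utility, total_spend, rounds_entered). Illustrative only.
    """
    horizon = len(values)
    target_rate = budget / horizon   # per-round spending target
    dual = 0.0                       # multiplier on the budget constraint
    observed = []                    # competing bids seen in entered rounds only
    remaining = budget
    utility = 0.0
    entered = 0

    for value, rival in zip(values, competing_bids):
        # Estimate the expected gain and payment of entering with bid = value.
        if observed:
            gain = sum(max(value - d, 0.0) for d in observed) / len(observed)
            cost = sum(d for d in observed if d <= value) / len(observed)
        else:
            gain, cost = value, 0.0  # optimistic start so that learning begins

        # Enter only if the payment is surely affordable and the
        # multiplier-adjusted expected gain is non-negative.
        participate = remaining >= value and gain - dual * cost >= 0.0

        payment = 0.0
        if participate:
            entered += 1
            # Partial feedback: the competing bid is seen only after entering.
            observed.append(rival)
            if value >= rival:       # assumed tie-breaking in our favor
                payment = rival      # second-price payment
                utility += value - rival
        remaining -= payment

        # Projected gradient step on the multiplier.
        dual = max(0.0, dual - step_size * (target_rate - payment))

    return utility, budget - remaining, entered


if __name__ == "__main__":
    random.seed(1)
    vals = [random.random() for _ in range(500)]
    rivals = [random.random() for _ in range(500)]
    print(run_throttling(vals, rivals, budget=50.0, step_size=0.05))
```

Because skipped rounds yield no observation, the estimate of the competing-bid distribution improves only through participation, which is the tension between learning and budget control that the partial-feedback setting describes.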

arXiv.org e-Print Archive

Budget-Constrained Auctions with Unassured Priors

Author: Chen Zhaohua
Deng Xiaotie
Li Jicheng
Wang Chang
Yang Mingwei
Publication date: 31/03/2022

In today's online advertising markets, it is common for an advertiser to set a long-period budget. Correspondingly, advertising platforms adopt budget control methods to ensure that any advertiser's payment is within her budget. Most budget control methods rely on value distributions of advertisers. However, due to the complex environment advertisers stand in and privacy issues, the platform hardly learns their true priors. Therefore, it is essential to understand how budget control auction mechanisms perform under unassured priors. This paper gives a two-fold answer. First, we propose a bid-discount method barely studied in the literature. We show that such a method exhibits desirable properties in revenue-maximizing and computation when fitting into first-price auction. Second, we compare this mechanism with another four in the prior manipulation model, where an advertiser can arbitrarily report a value distribution to the platform. These four mechanisms include the optimal mechanism satisfying budget-constrained IC, first-price/second-price mechanisms with the widely-studied pacing method, and an application of bid-discount in second-price mechanism. We consider three settings under the model, depending on whether the reported priors are fixed and advertisers are symmetric or not. We show that under all three cases, the bid-discount first-price auction we introduce dominates the other four mechanisms concerning the platform's revenue. For the advertisers' side, we show a surprising strategic-equivalence result between this mechanism and the optimal auction. Extensive revenue dominance and strategic relationships among these mechanisms are also revealed. Based on these findings, we provide a thorough understanding of prior dependency in repeated auctions with budgets. The bid-discount first-price auction itself may also be of further independent research interest. Comment: 47 pages, 2 figures, 1 table. In review

arXiv.org e-Print Archive

Synthesis of graphene and graphene nanostructures by ion implantation and pulsed laser annealing

Author: Appleton Bill R.
Berke Kara
Elliman Robert
Fridmann Joel
Gila Brent P.
Hebard Arthur F.
Ren Fan
Rudawski N G
Venkatachalam Dinesh
Wang Xiaotie
Publication venue: AIP Publishing
Publication date: 29/11/2018

In this paper, we report a systematic study that shows how the numerous processing parameters associated with ion implantation (II) and pulsed laser annealing (PLA) can be manipulated to control the quantity and quality of graphene (G), few-layer graphene (FLG), and other carbon nanostructures selectively synthesized in crystalline SiC (c-SiC). Controlled implantations of Si⁻ plus C⁻ and Au⁺ ions in c-SiC showed that both the thickness of the amorphous layer formed by ion damage and the doping effect of the implanted Au enhance the formation of G and FLG during PLA. The relative contributions of the amorphous and doping effects were studied separately, and thermal simulation calculations were used to estimate surface temperatures and to help understand the phase changes occurring during PLA. In addition to the amorphous layer thickness and catalytic doping effects, other enhancement effects were found to depend on other ion species, the annealing environment, PLA fluence and number of pulses, and even laser frequency. Optimum II and PLA conditions are identified and possible mechanisms for selective synthesis of G, FLG, and carbon nanostructures are discussed.

The Australian National University